Abstract

We find scaling limits for the sizes of the largest components at criticality for rank-1 inhomogeneous
random graphs with power-law degrees with power-law exponent $\tau$. We investigate the case where
$\tau \in (3,4)$, so that the degrees have finite variance but infinite third moment. The sizes of the largest
clusters, rescaled by $n^{-(\tau-2)/(\tau-1)}$, converge to hitting times of a 'thinned' Lévy process, a special case
of the general multiplicative coalescents studied by Aldous and Limic in [2] and [4].

Our results should be contrasted to the case where the degree exponent $\tau$ satisfies $\tau > 4$, so that
the third moment is finite. There, instead, we see that the sizes of the components rescaled by $n^{-2/3}$
converge to the excursion lengths of an inhomogeneous Brownian motion, as proved in [2] for the
Erdős-Rényi random graph and extended to the present setting in [8, 35].

1 Introduction



The critical behavior of random graphs has received tremendous attention in the past decades. The
simplest example of a random graph is the Erdős-Rényi random graph, whose critical nature is
by now well established (see e.g., the monographs [2, 5, 10, 19, 29] and the references therein). In the
past years, many examples of real-world networks have been found where the degrees are highly variable
and heavy-tailed, unlike the degrees in the Erdős-Rényi random graph, which instead are extremely
light-tailed. As a result, there has been a concerted effort to define and analyze models for such real-world
networks. See e.g., [1, 18, 32] for major reviews of real-world networks and models for them.

In this paper, we study how inhomogeneity in the random graph model changes the critical nature 
of the random graph. In our model, the vertices have a weight associated to them, and the weight of a 
vertex moderates its degree. Therefore, by choosing these weights appropriately, we can generate random 
graphs with highly variable degrees. For our class of random graphs, it is shown in [24, Theorem 1.1] that 
when the weights do not vary too much, the critical behavior is similar to the one in the Erdős-Rényi
random graph. See in particular the recent works [8, 35], where it was shown that if the degrees have
finite third moment, then the scaling limit for the largest critical components in the critical window is
essentially the same as for the Erdős-Rényi random graph, as identified by Aldous in [2].

Interestingly, in [24, Theorem 1.2], it was shown that when the degrees have infinite third moment, 
then the sizes of the largest critical clusters are quite different. See also [22] for a related result for 
the configuration model, another random graph model having flexible degrees. In this paper, we bring 
this discussion substantially further by identifying the scaling limits of the largest critical clusters in the 
critical window in the regime where the degrees have infinite third moments. As we shall see, this scaling 
limit is rather different compared to that for the Erdős-Rényi random graph. We shall now first introduce
the model that we shall investigate. 



1.1 Model 

In our random graph model, vertices have weights, and the edges are independent with the edge probability
being approximately equal to the rescaled product of the weights of the two end vertices of the edge. While
there are many different versions of such random graphs (see below), it will be convenient for us to work
with the so-called Poissonian random graph or Norros-Reittu model [33]. To define the model, we consider
the vertex set $[n] := \{1, 2, \ldots, n\}$ and suppose each vertex is assigned a weight, vertex $i$ having weight $w_i$.
Now, attach an edge with probability $p_{ij}$ between vertices $i$ and $j$, where

$$p_{ij} = 1 - \exp(-w_i w_j/\ell_n), \tag{1.1}$$

with $\ell_n$ denoting the total weight

$$\ell_n = \sum_{i=1}^{n} w_i. \tag{1.2}$$

Different edges are independent. In this model, the average degree of vertex $i$ is close to $w_i$, thus
incorporating inhomogeneity in the model.
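
As an illustration of the definition in (1.1) and (1.2), the following minimal Python sketch samples the edge set of the graph and computes the exact expected degree of a vertex. The function names `norros_reittu_edges` and `expected_degree` are our own hypothetical choices, and vertices are indexed from 0 for convenience; this is a quadratic-time sketch, not an efficient sampler.

```python
import math
import random


def norros_reittu_edges(weights, rng):
    """Sample the edges of the graph with edge probabilities as in (1.1)."""
    n = len(weights)
    total = sum(weights)  # total weight, as in (1.2)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p_ij = 1.0 - math.exp(-weights[i] * weights[j] / total)
            if rng.random() < p_ij:  # edges are independent
                edges.append((i, j))
    return edges


def expected_degree(weights, i):
    """Exact expected degree of vertex i: the sum of p_ij over j != i."""
    total = sum(weights)
    return sum(
        1.0 - math.exp(-weights[i] * w / total)
        for j, w in enumerate(weights)
        if j != i
    )
```

For instance, with all weights equal to 1, `expected_degree` returns a value slightly below 1, in line with the statement that the average degree of vertex $i$ is close to its weight.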

There are many adaptations of this model, for which equivalent results hold. Indeed, the model
considered here is a special case of the so-called rank-1 inhomogeneous random graph introduced in great
generality by Bollobás, Janson and Riordan in [11]. It is asymptotically equivalent with many related
models, such as the random graph with given prescribed degrees or Chung-Lu model, where instead

$$p_{ij} = \min\{w_i w_j/\ell_n, 1\}, \tag{1.3}$$

and which has been studied intensively by Chung and Lu (see [13, 14, 15, 16, 17]). A further adaptation
is the generalized random graph introduced by Britton, Deijfen and Martin-Löf in [12], for which

$$p_{ij} = \frac{w_i w_j}{\ell_n + w_i w_j}. \tag{1.4}$$

See Janson [28] for conditions under which these random graphs are asymptotically equivalent, meaning that
all events have asymptotically equal probabilities. As discussed in more detail in [24, Section 1.3], these
conditions apply in the setting to be studied in this paper. Therefore, all results proved here also hold
for these related rank-1 models.
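
To see numerically why these models are close, the sketch below evaluates the three edge probabilities for one pair of vertices. It assumes that the Chung-Lu probability is the ratio $w_i w_j/\ell_n$ capped at 1, as a probability must be; the function name `edge_probabilities` and the dictionary keys are hypothetical.

```python
import math


def edge_probabilities(w_i, w_j, total):
    """Edge probabilities of the three rank-1 models for one vertex pair."""
    x = w_i * w_j / total  # rescaled product of the two weights
    return {
        "norros_reittu": 1.0 - math.exp(-x),  # Poissonian graph, (1.1)
        "chung_lu": min(x, 1.0),              # assumed capped at 1
        "generalized": x / (1.0 + x),         # w_i w_j / (total + w_i w_j)
    }
```

For $x = w_i w_j/\ell_n \geq 0$ the three values are ordered as $x/(1+x) \leq 1 - e^{-x} \leq \min\{x, 1\}$, and for small $x$ they differ by at most $x^2$. This is only a pointwise comparison; the asymptotic equivalence itself rests on the conditions in [28].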

Let us now specify how the weights are chosen. We let the weight sequence $w = (w_i)_{i \in [n]}$ be defined
by

$$w_i = [1-F]^{-1}(i/n), \tag{1.5}$$

where $F$ is a distribution function on $[0, \infty)$ for which we assume that there exists a $\tau \in (3, 4)$ and
$0 < c_F < \infty$ such that

$$\lim_{x \to \infty} x^{\tau-1}[1 - F(x)] = c_F, \tag{1.6}$$

and where $[1-F]^{-1}$ is the generalized inverse function of $1 - F$ defined, for $u \in (0, 1)$, by

$$[1-F]^{-1}(u) = \inf\{s : [1-F](s) \leq u\}. \tag{1.7}$$

By convention, we set $[1-F]^{-1}(1) = 0$. We often make use of the fact that, with $U$ uniform on $[0, 1]$, the
random variable $[1-F]^{-1}(U)$ has distribution function $F$.

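An interpretation of the choice in (1.5) is that the weight of a vertex $V_n$ chosen uniformly in $[n]$ has
distribution function $F_n$ given by

$$F_n(x) = \mathbb{P}(w_{V_n} \leq x) = \frac{1}{n}\sum_{j \in [n]} \mathbf{1}_{\{w_j \leq x\}} = \frac{1}{n}\sum_{j \in [n]} \mathbf{1}_{\{[1-F]^{-1}(j/n) \leq x\}} = \frac{1}{n}\sum_{i=0}^{n-1} \mathbf{1}_{\{[1-F]^{-1}(1 - i/n) \leq x\}}$$
$$= \frac{1}{n}\sum_{i=0}^{n-1} \mathbf{1}_{\{F^{-1}(i/n) \leq x\}} = \frac{1}{n}\sum_{i=0}^{n-1} \mathbf{1}_{\{i/n \leq F(x)\}} = \frac{1}{n}\big(\lfloor nF(x) \rfloor + 1\big) \wedge 1, \tag{1.8}$$

where, throughout this paper and for $x, y \in \mathbb{R}$, we write $(x \wedge y) = \min\{x,y\}$ and $(x \vee y) = \max\{x,y\}$.
By (1.8), $F_n \to F$ uniformly. As a result, a uniformly chosen vertex has a weight which is close in
distributional sense to $F$.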

For the setting in (1.1) and (1.5), by [11, Theorem 3.13], the number of vertices with degree $k$, which
we denote by $N_k$, satisfies

$$N_k/n \xrightarrow{\mathbb{P}} \mathbb{E}\Big[e^{-W}\frac{W^k}{k!}\Big], \qquad k \geq 0, \tag{1.9}$$

where $\xrightarrow{\mathbb{P}}$ denotes convergence in probability, and where $W$ has distribution function $F$ appearing in (1.5).
We recognize the limiting distribution as a so-called mixed Poisson distribution with mixing distribution
$F$, i.e., conditionally on $W = w$, the distribution is Poisson with mean $w$. Equation (1.9) also implies that
the distribution of the degree of a uniformly chosen vertex in $[n]$ converges to a mixed Poisson distribution
with mixing distribution $F$. This can be understood by noting that the weight of a uniformly chosen
vertex is, by (1.8), close in distribution to $F$. In turn, when a vertex has weight $w$, then, by (1.1), its
degree is close to Poisson with parameter $w$. Since a Poisson random variable with large parameter $w$
is closely concentrated around its mean $w$, we see that the tail behavior of the degrees in our random
graph is close to that of the distribution $F$. As a result, when (1.6) holds, and with $D_n$ the degree of a
uniformly chosen vertex in $[n]$, $\limsup_{n \to \infty} \mathbb{E}[D_n^a] < \infty$ when $a < \tau - 1$ and $\limsup_{n \to \infty} \mathbb{E}[D_n^a] = \infty$ when
$a > \tau - 1$. In particular, the degree of a uniformly chosen vertex in $[n]$ has finite second, but infinite third
moment when (1.6) holds with $\tau \in (3,4)$.

We shall frequently make use of the fact that (1.6) implies that, as $u \downarrow 0$,

$$[1-F]^{-1}(u) = (c_F/u)^{1/(\tau-1)}(1 + o(1)). \tag{1.10}$$

Under the key assumption in (1.6), we have that the third moment of the degrees tends to infinity, i.e.,
with $W \sim F$, we have $\mathbb{E}[W^3] = \infty$. Define

$$\nu = \mathbb{E}[W^2]/\mathbb{E}[W], \tag{1.11}$$

so that, again by (1.6), $\nu < \infty$. Then, by [11, Theorem 3.1] (see also [11, Section 16.4] for a detailed
discussion on rank-1 inhomogeneous random graphs, of which our random graph is an example), when
$\nu > 1$, there is one giant component of size proportional to $n$, while all other components are of smaller size
$o(n)$, and when $\nu \leq 1$, the largest connected component contains a proportion of vertices that converges
to zero in probability. Thus, the critical value of the model is $\nu = 1$. The main aim of this paper is to
investigate what happens close to the critical point, i.e., when $\nu = 1$.
A simple example arises when we take

$$F(x) = \begin{cases} 0 & \text{for } x < a, \\ 1 - (a/x)^{\tau-1} & \text{for } x \geq a, \end{cases} \tag{1.12}$$

in which case $[1-F]^{-1}(u) = a(1/u)^{1/(\tau-1)}$, so that $w_j = a(n/j)^{1/(\tau-1)}$ and

$$\mathbb{E}[W] = \frac{a(\tau-1)}{\tau-2}, \qquad \mathbb{E}[W^2] = \frac{a^2(\tau-1)}{\tau-3}. \tag{1.13}$$

The critical case thus arises when

$$\nu = \mathbb{E}[W^2]/\mathbb{E}[W] = \frac{a(\tau-2)}{\tau-3} = 1, \tag{1.14}$$

i.e., when $a = (\tau-3)/(\tau-2)$.
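
The computation in this example can be checked with a short Python sketch. The helper names (`pareto_weights`, `critical_scale`, `nu_exact`, `empirical_nu`) are hypothetical and introduced only for illustration. The sketch builds the weights $w_j = a(n/j)^{1/(\tau-1)}$, evaluates the closed-form ratio of the second to the first moment, and compares it with the finite-$n$ ratio $\sum_j w_j^2/\sum_j w_j$.

```python
def pareto_weights(n, tau, a):
    """Weights w_j = a * (n / j)**(1 / (tau - 1)) for j = 1, ..., n."""
    return [a * (n / j) ** (1.0 / (tau - 1.0)) for j in range(1, n + 1)]


def critical_scale(tau):
    """Value of a at which the closed-form ratio nu equals 1."""
    return (tau - 3.0) / (tau - 2.0)


def nu_exact(tau, a):
    """Closed-form E[W^2] / E[W] for the power-law example."""
    mean = a * (tau - 1.0) / (tau - 2.0)
    second = a * a * (tau - 1.0) / (tau - 3.0)
    return second / mean


def empirical_nu(weights):
    """Finite-n analogue: sum of squared weights over the sum of weights."""
    return sum(w * w for w in weights) / sum(weights)
```

With `tau = 3.5` one finds `critical_scale(tau)` equal to $1/3$ and `nu_exact` equal to 1 up to rounding. The finite-$n$ ratio returned by `empirical_nu` need not equal 1 exactly, since it replaces the two expectations by sums over the weight sequence.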

With the definition of the weights in (1.5), we shall write $\mathcal{G}_n^0(w)$ for the graph constructed with the
probabilities in (1.1), while, for any fixed $\lambda \in \mathbb{R}$, we shall write $\mathcal{G}_n^{\lambda}(w)$ when we use the weight sequence

$$w(\lambda) = (1 + \lambda n^{-(\tau-3)/(\tau-1)})\, w. \tag{1.15}$$

We shall assume that $n$ is so large that $1 + \lambda n^{-(\tau-3)/(\tau-1)} \geq 0$, so that $w_i(\lambda) \geq 0$ for all $i \in [n]$. This
setting has first been studied in [24], where, for the largest connected component $\mathcal{C}_{\max}$ and each $\lambda \in \mathbb{R}$, it
is proved that both $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{\max}|$ and $n^{(\tau-2)/(\tau-1)}/|\mathcal{C}_{\max}|$ are tight sequences of random variables.
In this paper, we bring the discussion of the critical behavior of such inhomogeneous random graphs
substantially further, by identifying the scaling limit of $(n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{(i)}|)_{i \geq 1}$, where $(\mathcal{C}_{(i)})_{i \geq 1}$ denote the
connected components ordered in size, i.e., $|\mathcal{C}_{\max}| = |\mathcal{C}_{(1)}| \geq |\mathcal{C}_{(2)}| \geq \cdots$.

Interestingly, as proved in [8, 24, 35], when $\tau > 4$, so that $\mathbb{E}[W^3] < \infty$, the scaling limit of the
random graphs studied here is (apart from a trivial scaling constant) equal to the scaling limit of the
ordered connected components in the Erdős-Rényi random graph, as first identified by Aldous in [2]. This
suggests that the high-weight vertices play a crucial role in our setting, a fact that shall feature extensively 
throughout our proof. The importance of the high-weight vertices also partly explains why we restrict 
our setting to (1.5) and (1.6), which give us sharp asymptotics of the weights of the high-weight vertices 
in the heavy-tailed setting we study here. We shall comment on extensions of our results in more detail 
in Section 1.5 below. 

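Before stating our main results, we introduce some notation. For a vertex $i \in [n]$, we write $\mathcal{C}(i)$ for
the vertices in the connected component or cluster of $i$. Further, let

$$\mathcal{C}_{\leq}(i) = \begin{cases} \mathcal{C}(i) & \text{if } i \leq j \ \ \forall j \in \mathcal{C}(i), \\ \varnothing & \text{otherwise.} \end{cases} \tag{1.16}$$

Then, clearly, $|\mathcal{C}_{\max}| = \max_{i \in [n]} |\mathcal{C}(i)| = \max_{i \in [n]} |\mathcal{C}_{\leq}(i)|$, and $(|\mathcal{C}_{(i)}|)_{i \geq 1}$ is equal to the sequence $(|\mathcal{C}_{\leq}(i)|)_{i \geq 1}$
ordered in size. We further define the cluster weight of vertex $i$ to be

$$\mathcal{W}(i) = \sum_{j \in \mathcal{C}(i)} w_j, \tag{1.17}$$

and let $\mathcal{W}_{\leq}(i)$ be as in (1.17), where the sum is restricted to $\mathcal{C}_{\leq}(i)$. We again let $(|\mathcal{W}_{(i)}|)_{i \geq 1}$ be equal to
the sequence $(|\mathcal{W}_{\leq}(i)|)_{i \geq 1}$ ordered in size.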

Throughout this paper, we shall make use of the following standard notation. We let $\xrightarrow{d}$ denote
convergence in distribution, and $\xrightarrow{\mathbb{P}}$ convergence in probability. For a sequence of random variables
$(X_n)_{n \geq 1}$, we write $X_n = O_{\mathbb{P}}(b_n)$ when $|X_n|/b_n$ is a tight sequence of random variables as $n \to \infty$,
and $X_n = o_{\mathbb{P}}(b_n)$ when $|X_n|/b_n \xrightarrow{\mathbb{P}} 0$ as $n \to \infty$. For a non-negative function $n \mapsto g(n)$, we write
$f(n) = O(g(n))$ when $|f(n)|/g(n)$ is uniformly bounded, and $f(n) = o(g(n))$ when $\lim_{n \to \infty} f(n)/g(n) = 0$.
Furthermore, we write $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$. Finally, we write that a
sequence of events $(E_n)_{n \geq 1}$ occurs with high probability (whp) when $\mathbb{P}(E_n) \to 1$.

Now we are ready to state our main results. We start in Section 1.2 by describing the scaling limit of 
the ordered clusters, and in Section 1.3 we discuss further properties of the scaling limit.

1.2 The scaling limit for $\tau \in (3,4)$

In this section, we investigate the scaling limit of the connected components ordered in size. Our first 
main result is as follows: 

Theorem 1.1 (Weak convergence of the ordered critical clusters for $\tau \in (3,4)$). Fix the Norros-Reittu
random graph with weights $w(\lambda)$ defined in (1.15). Assume that $\nu = 1$ and that (1.6) holds. Then, for all
$\lambda \in \mathbb{R}$,

$$\big(n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{(i)}|\big)_{i \geq 1} \xrightarrow{d} (\gamma_i(\lambda))_{i \geq 1}, \tag{1.18}$$

in the product topology, for some non-degenerate limit $(\gamma_i(\lambda))_{i \geq 1}$.

We next study the joint convergence of the clusters for different values of $\lambda \in \mathbb{R}$. By increasing $\lambda$,
more and more edges are being formed. These extra edges potentially create connections between disjoint
clusters, thus merging them. As a result, we can interpret $\lambda$ as a time variable, and as time increases,
clusters are being merged. This resembles a coalescence process, as studied in [7]. We now make this
connection precise. Before being able to do so, we introduce some necessary notation.

We first give a quick overview of Aldous' standard multiplicative coalescent and how it relates to the
limiting random variables in Theorem 1.1, seen as functions of the parameter $\lambda$. It shall not be possible to
give a full description of the process and its many fascinating properties here and we refer the interested
reader to the paper [4], the survey paper [3] and the book [7].

Write $\ell^2_{\searrow}$ for the metric space of infinite real-valued sequences $x = (x_1, x_2, \ldots)$ with $x_1 \geq x_2 \geq \cdots \geq 0$
and $\sum_{i=1}^{\infty} x_i^2 < \infty$, with the $\ell^2$-norm as the metric. The standard multiplicative coalescent is described
as the Markov process with states in $\ell^2_{\searrow}$ whose dynamics is as follows: for each pair of clusters $(x,y)$,
the pair merges at rate $xy$. Thus, the multiplicative coalescent is a continuous-time Markov process of
the masses of an infinite number of particles, where two particles merge at a rate equal to the product of
their masses.
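
The merging dynamics can be illustrated for finitely many particles by a simple event-driven simulation. The following Python sketch is a finite-state toy version under our own assumptions (finitely many strictly positive masses, a fixed time horizon `t_end`); the function name `multiplicative_coalescent` is hypothetical, and the sketch does not capture the infinite-mass entrance behavior discussed in the text.

```python
import random


def multiplicative_coalescent(masses, t_end, rng):
    """Run the finite multiplicative coalescent until time t_end.

    Each unordered pair of blocks with masses x and y merges at rate x * y.
    Returns the block masses at time t_end in decreasing order.
    """
    blocks = [float(x) for x in masses]
    t = 0.0
    while len(blocks) > 1:
        s1 = sum(blocks)
        s2 = sum(x * x for x in blocks)
        total_rate = 0.5 * (s1 * s1 - s2)  # sum of x * y over unordered pairs
        if total_rate <= 0.0:
            break
        t += rng.expovariate(total_rate)  # waiting time until the next merger
        if t > t_end:
            break
        idx = range(len(blocks))
        # Pick i with weight x_i * (s1 - x_i), then j != i with weight x_j,
        # so that the unordered pair {i, j} has probability x_i x_j / total_rate.
        i = rng.choices(idx, weights=[x * (s1 - x) for x in blocks])[0]
        j = rng.choices(
            idx, weights=[0.0 if k == i else x for k, x in enumerate(blocks)]
        )[0]
        merged = blocks[i] + blocks[j]
        blocks = [x for k, x in enumerate(blocks) if k not in (i, j)]
        blocks.append(merged)
    return sorted(blocks, reverse=True)
```

The total mass is conserved at every merger, and the number of blocks can only decrease in time.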

In [2], Aldous showed that there is a Feller process on the space $\ell^2_{\searrow}$ defined for all times $-\infty < t < \infty$
starting from infinitesimally small masses at time $-\infty$, and following the above merging dynamics. The
distribution of the coalescent process at any time $t$ is the same as the limiting cluster sizes of an
Erdős-Rényi random graph with edge probabilities $p_n = (1 + tn^{-1/3})/n$.

Aldous and Limic [4] explicitly characterize the entrance boundary at $-\infty$ of the above Markov process,
in the sense that they prove that every extreme version of the above Markov process is characterized by a
diffusion parameter $\kappa$, a translation parameter $\beta$, and a vector of 'limiting largest weights' $\mathbf{c} = (c_1, c_2, \ldots)$
that describe the asymptotic decay of the masses of the particles at time $-\infty$. We refer the interested
reader to [4] for the full description of this process. In this terminology, the multiplicative coalescent can
be described as the ordered lengths of excursions beyond past minima of the process

$$W^{\kappa,\beta,\mathbf{c}}(s) = \kappa^{1/2} W(s) + \beta s - \tfrac{1}{2}\kappa s^2 + V^{\mathbf{c}}(s), \tag{1.19}$$

where $(W(s))_{s \geq 0}$ is a standard Brownian motion, while

$$V^{\mathbf{c}}(s) = \sum_{j=1}^{\infty} c_j\big(\mathbf{1}_{\{E_j \leq s\}} - c_j s\big), \tag{1.20}$$

with $(E_j)_{j \geq 1}$ independent exponential random variables, $E_j$ having mean $1/c_j$. Then, the
$(\kappa, \beta, \mathbf{c})$-multiplicative coalescent is the set of ordered lengths of excursions from zero of the reflected process

$$B^{\kappa,\beta,\mathbf{c}}(s) = W^{\kappa,\beta,\mathbf{c}}(s) - \min_{0 \leq s' \leq s} W^{\kappa,\beta,\mathbf{c}}(s'). \tag{1.21}$$

Part of the proof of [4] is the fact that these ordered excursions can be defined properly.

The following theorem draws a connection between the components of the graph for a fixed $\lambda$ and the
sizes of clusters at the same time in a multiplicative coalescent with a particular entrance boundary, scale
and translation parameter. For this, define the sequence

$$\mathbf{c} = \big(c\, i^{-1/(\tau-1)}\big)_{i \geq 1} \quad \text{with} \quad c = c_F^{1/(\tau-1)}. \tag{1.22}$$

Then, we have the following theorem:

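Theorem 1.2 (Relation to multiplicative coalescents). Assume that the conditions in Theorem 1.1 hold.
Consider the sequence-valued random variables $X^*(\lambda) = (\gamma_1(\mathbb{E}[W]\lambda), \gamma_2(\mathbb{E}[W]\lambda), \ldots)$ obtained in
Theorem 1.1. Then $X^*(\lambda)$ has the same distribution as a multiplicative coalescent at time $\lambda$ with entrance
boundary $\mathbf{c}/\mathbb{E}[W]$, diffusion constant $\kappa = 0$ and centering constant $\beta = -\zeta/\mathbb{E}[W]$, where $\zeta$ is identified
explicitly in (2.18) below. More precisely, there exists a simultaneous coupling of the clusters $(|\mathcal{C}_{(i)}(\lambda)|)_{i \geq 1}$,
where $|\mathcal{C}_{(i)}(\lambda)|$ is the $i$th largest cluster when the weights are equal to $w(\lambda)$, such that, for every vector
$(\lambda_1, \lambda_2, \ldots, \lambda_k)$,

$$\big(n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{(i)}(\lambda_l)|\big)_{i \geq 1,\ l = 1, \ldots, k} \xrightarrow{d} \big(X^*(\lambda_l/\mathbb{E}[W])\big)_{l = 1, \ldots, k}. \tag{1.23}$$

In particular,

$$|\lambda|\gamma_j(\lambda) \xrightarrow{\mathbb{P}} c_j \quad \text{as } \lambda \to -\infty \text{ for each } j \geq 1. \tag{1.24}$$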

Theorem 1.2 proves that the finite-dimensional distributions of the rescaled cluster sizes converge to
those of a multiplicative coalescent. While we believe that also process convergence holds, viewing the
processes as elements of an appropriate function space, we have no proof for this fact. See Section 7
for a full proof of this result. The setting in this paper is the first example where the multiplicative
coalescent with $\kappa = 0$ arises in random graph theory. Indeed, all random graph examples in [4] have
largest component sizes of the order $n^{2/3}$, like for the Erdős-Rényi random graph studied in [2]. Our
example links the multiplicative coalescent also to random graphs with the largest critical connected
components of the order $n^{(\tau-2)/(\tau-1)}$ instead of $n^{2/3}$.

A crucial part of the proof of Theorem 1.2 is the analysis of the subcritical phase of our model. The
asymptotics of the rescaled ordered cluster sizes in the subcritical regime acts as the entrance boundary
of the multiplicative coalescent, as explained in more detail in [4, Proposition 7]. This entrance boundary
is identified in the following theorem, which is of independent interest. In the statement of Theorem 1.3,
the lower bound on $\lambda_n$ appears only to ensure that $w_i(\lambda_n) = (1 + \lambda_n n^{-(\tau-3)/(\tau-1)})\, w_i \geq 0$ for every $i \in [n]$.

Theorem 1.3 (Subcritical phase). Assume that the conditions in Theorem 1.1 hold, but now take $\lambda =
\lambda_n \to -\infty$ as $n \to \infty$ such that $\lambda_n \geq -n^{(\tau-3)/(\tau-1)}$. Then, for each $j \in \mathbb{N}$,

$$|\lambda_n|\, n^{-(\tau-2)/(\tau-1)}\, |\mathcal{C}_{(j)}| \xrightarrow{\mathbb{P}} c_j. \tag{1.25}$$

Theorem 1.3 is proved in Section 6. Interestingly, the limit in (1.25) is deterministic (recall also
(1.22)). The rough idea for this is as follows. As $\lambda = \lambda_n \to -\infty$, the random graph becomes more and
more subcritical. Now, if we look at $\mathcal{C}(j)$, the cluster of vertex $j$, then we can view it as the union of
approximately $w_j$ (which is roughly the degree of vertex $j$) almost independent clusters. These clusters
are close to total progenies of branching processes having mean offspring $\nu_n(\lambda_n) \approx 1 + \lambda_n n^{-(\tau-3)/(\tau-1)}$.
The expected total progeny of a branching process with mean offspring $\nu < 1$ equals $1/(1 - \nu)$. As a result,
the expected cluster size of vertex $j$ is close to

$$\frac{w_j}{1 - \nu_n(\lambda_n)} \approx \frac{w_j}{|\lambda_n|\, n^{-(\tau-3)/(\tau-1)}} = \frac{c_j}{|\lambda_n|}\, n^{(\tau-2)/(\tau-1)}(1 + o(1)). \tag{1.26}$$

In our setting, $c_j = c\, j^{-1/(\tau-1)}$, so that $j \mapsto c_j$ is strictly decreasing. Thus, we must also have
that $|\mathcal{C}(j)| = |\mathcal{C}_{(j)}|$ whp. The proof of Theorem 1.3 makes this argument precise, by investigating the
deviation from a branching process, a technique that is also crucially used in [24] to study tightness of
the sequence of random variables $|\mathcal{C}_{\max}|\, n^{-(\tau-2)/(\tau-1)}$. A result similar to Theorem 1.3 is proved for the
near-critical phase of the configuration model in [25], but the proof we give here is entirely different.
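
The branching process step in this heuristic can be checked by simulation. The Python sketch below assumes Poisson offspring (our choice for illustration; the heuristic only uses the mean offspring) and compares the empirical mean total progeny with $1/(1-\nu)$. The names `poisson`, `total_progeny` and `heuristic_cluster_size`, as well as the safety cap `cap`, are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Sample a Poisson variable by Knuth's multiplication method."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def total_progeny(mean_offspring, rng, cap=10**6):
    """Total number of individuals in a Poisson branching process."""
    alive = 1  # size of the current generation
    total = 1  # individuals seen so far, including the root
    while alive > 0 and total < cap:
        children = sum(poisson(mean_offspring, rng) for _ in range(alive))
        total += children
        alive = children
    return total


def heuristic_cluster_size(w_j, nu):
    """Heuristic expected cluster size w_j / (1 - nu) for nu < 1."""
    return w_j / (1.0 - nu)
```

For mean offspring $1/2$, averaging `total_progeny` over many independent runs gives a value close to 2, in agreement with $1/(1-\nu)$.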

We also obtain that the ordered cluster weights as defined in (1.17) satisfy the same scaling results as 
described above: 

Theorem 1.4 (Scaling limit of cluster weights). Theorems 1.1, 1.2 and 1.3 also hold for the ordered
cluster weights $(n^{-(\tau-2)/(\tau-1)}\mathcal{W}_{(i)})_{i \geq 1}$, with identical scaling limits as in Theorems 1.1, 1.2 and 1.3.

As explained in more detail in Section 2.1 below, Theorem 1.4 can be heuristically understood by
noting that the average weight of a vertex in a cluster is close to $\nu = 1$, and therefore it contributes
the same to the weight of the cluster as it does to the cluster size. In fact, the proof will show that
$n^{-(\tau-2)/(\tau-1)}\mathcal{W}_{(i)}$ and $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{(i)}|$ converge to the same limit. The proof of Theorem 1.4 shall be
given simultaneously with the proofs of Theorems 1.1, 1.2 and 1.3, respectively, adapted so as to deal
with cluster weights or cluster sizes. Sometimes it is more convenient to study cluster sizes (for example,
since cluster explorations can more naturally be formulated in terms of the number of vertices than their
weight), while in other cases it is more convenient to work with cluster weights (for example, since the cluster
weights can be described in terms of multiplicative coalescents, a fact crucial in the proof of Theorem
1.2).

1.3 Properties of large critical clusters 

We shall also derive some related interesting properties of the limiting largest clusters. In the following 
theorem, we consider the connectivity structure of the high-weight vertices: 

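Theorem 1.5 (Connectivity of high-weight vertices). Under the assumptions in Theorem 1.1, for every
$i, j \geq 1$ fixed,

$$\lim_{n \to \infty} \mathbb{P}(j \in \mathcal{C}(i)) = q_{ij}(\lambda) \in (0,1), \tag{1.27}$$

and

$$\lim_{n \to \infty} \mathbb{P}(i \in \mathcal{C}_{\max}) = q_i(\lambda) \in (0,1). \tag{1.28}$$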

Theorem 1.5 states that the high-weight vertices play an essential role in the critical regime. Indeed, we
shall see that in the subcritical regime, with high probability, $\mathcal{C}_{\max} = \mathcal{C}(1)$, so that $\mathbb{P}(1 \in \mathcal{C}_{\max}) = 1 - o(1)$,
while $\mathbb{P}(i \in \mathcal{C}_{\max}) = o(1)$ for $i > 1$. In the supercritical regime, instead, $\mathbb{P}(i \in \mathcal{C}_{\max}) = 1 - o(1)$ for every
$i \geq 1$ fixed. Thus, the critical regime is precisely the regime where the high-weight vertices start to form
connections. Informally, this can be phrased as 'power to the wealthy'. Theorem 1.5 should be contrasted
with the situation when $\mathbb{E}[W^3] < \infty$ studied in [8, 35], where the probability that any specific vertex is
an element of $\mathcal{C}_{\max}$ is negligible, and, instead, the largest cluster is born out of many trials each having a
small probability. This can be informally phrased as 'power to the masses'.

The following theorem, which is a crucial ingredient in the proof of Theorem 1.1, essentially says that,
for each fixed $\lambda$, the maximal size components are those attached to the largest weight vertices:

Theorem 1.6 (Large clusters contain high-weight vertex). Assume that the conditions in Theorem 1.1
hold. Then

(a) for any $\varepsilon \in (0, 1)$, there exists a $K = K(\varepsilon) \geq 1$, such that, for all $n$,

$$\mathbb{P}\Big(\max_{i \geq K} |\mathcal{C}_{\leq}(i)| \geq \varepsilon n^{(\tau-2)/(\tau-1)}\Big) \leq \varepsilon; \tag{1.29}$$

(b) for any $m \geq 1$,

$$\lim_{\varepsilon \downarrow 0} \limsup_{n \to \infty} \mathbb{P}\Big((|\mathcal{C}_{\leq}(i)|)_{i \in [n]} \text{ contains at least } m \text{ components of size} \geq \varepsilon n^{(\tau-2)/(\tau-1)}\Big) = 1. \tag{1.30}$$

1.4 Overview of the proofs

In this section, we give an overview of the proofs of our main results. We start by explaining the proof 
of Theorem 1.1, along the way also explaining the key ideas behind Theorems 1.5 and 1.6. After this, we 
shall discuss the proofs of Theorem 1.2 and 1.3. 

We note that, since $u \mapsto [1-F]^{-1}(u)$ is non-increasing, $w$ is ordered in size, i.e., $w_1 \geq w_2 \geq w_3 \geq \cdots$.
We start by exploring the clusters from the largest weight vertices onwards. Here, by a cluster exploration,
we mean the recursive investigation of the neighbors of the vertices already found to be in the cluster.
This cluster exploration shall be described in detail in Section 2.1. The rough idea is as follows. We start
with a vertex $i$, and wish to find all the vertices that are in its cluster. For this, we iteratively take a
vertex that has been found to be in the cluster but whose direct neighbors we have not yet inspected, and check which
vertices that are not yet found to be in the cluster are neighbors of it. Call a vertex active when it is
found to be in the cluster, but has not yet been explored. A vertex is called explored when its neighbors
have been investigated and neutral when it has not yet appeared in the exploration process. Then, in
the exploration process at time $t$, we take a vertex, turn it from active to explored, and explore it, i.e.,
see which neutral neighbors it has. Turn the status of its neutral neighbors to active. Let $Z_l$ denote the
number of active vertices after the exploration of the $l$th active vertex. When $Z_l = 0$ for the first time,
then there are no more active vertices, so all elements of the cluster have been found. (The description in
Section 2.1 is slightly different from the one described here, as it studies potential elements of the cluster
instead.)
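
The exploration described above can be sketched in a few lines of Python. The sketch assumes the graph is given as a dictionary `adj` mapping each vertex to the set of its neighbors; this representation and the function name `explore_cluster` are our own choices, and the order in which active vertices are explored (first in, first out) is an arbitrary convention that does not affect the cluster found.

```python
from collections import deque


def explore_cluster(adj, start):
    """Explore the cluster of `start` and record the number of active vertices.

    Returns (explored, z), where explored lists the vertices of the cluster in
    exploration order and z[l] is the number of active vertices after the
    l-th exploration step (z[0] = 1 for the starting vertex).
    """
    active = deque([start])
    seen = {start}  # vertices that are active or explored; the rest are neutral
    explored = []
    z = [1]
    while active:
        v = active.popleft()  # turn v from active to explored
        explored.append(v)
        for u in adj[v]:
            if u not in seen:  # u is a neutral neighbor of v
                seen.add(u)
                active.append(u)
        z.append(len(active))
    return explored, z
```

In this sketch each step changes the number of active vertices by the number of newly found neutral neighbors minus one, and the first index $l$ with `z[l] == 0` equals the size of the cluster.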

We note that the high-weight vertices have weights of the order $w_j \sim (c_F n/j)^{1/(\tau-1)}$, so, when we start
with a high-weight vertex, initially, the number of active vertices shall be of the order $n^{1/(\tau-1)}$. When our
exploration process hits another high-weight vertex, then the number of active vertices gets a large push
of the order $n^{1/(\tau-1)}$ upwards. It is these upward pushes that change the number of active vertices in a
substantial way, and, therefore, the high-weight vertices play a crucial role in the critical behavior of our
random graph. In turn, this suggests that the largest clusters contain at least one high-weight vertex, as
indicated by Theorem 1.6. Due to the critical nature of our random graph, it turns out that the average
number and weight of active vertices being added in each exploration is close to 1, so that, due to the
removal of the vertex which is being explored, the exploration process has increments that have mean
close to zero.

In Section 2, we shall start by identifying the scaling limit of $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{\leq}(1)| = n^{-(\tau-2)/(\tau-1)}|\mathcal{C}(1)|$.
The weak limit of $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}(1)|$ is given in terms of the hitting time of 0 of an exploration process
exploring the cluster of vertex 1 (the vertex with the highest weight). See Theorems 2.1 and 2.4. The
scaling limit of the exploration process of a cluster exists (see Theorem 2.4), and can be viewed as a
'thinned' Lévy process. Therefore, the convergence in distribution of $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}(1)|$ as in Theorem
2.1 is equivalent to the convergence of the first hitting time of zero of the exploration process to the one
of this thinned Lévy process. In proving this, we shall employ a careful analysis of hitting times of a
spectrally positive Lévy process that stochastically dominates the thinned Lévy process.

Following the proof of convergence of $n^{-(\tau-2)/(\tau-1)}|\mathcal{C}(1)|$ in Theorem 2.1, we shall prove the
convergence in distribution of $(n^{-(\tau-2)/(\tau-1)}|\mathcal{C}_{\leq}(i)|)_{i \in [n]}$ in Theorem 4.1. This proof makes crucial use of the
estimates in the proof of Theorem 2.1, and allows us to extend the result in Theorem 2.1 to the (joint)
convergence of several rescaled clusters by an inductive argument. By Theorem 1.6, with high probability,
the largest $m$ clusters are given by the largest $m$ elements of the vector $(|\mathcal{C}_{\leq}(i)|)_{i \in [n]}$, so that this completes
the proof of Theorem 1.1. The conclusion of this argument shall be carried out in Section 5 below.

In Section 6, we prove Theorem 1.3 by a second moment argument, using the fact that the subcritical 
phase of our random graph is closely related to (and even stochastically dominated by) a branching process. 
In Section 7, we use the results proved in Section 6, jointly with the results in [4], to prove Theorem 1.2. 
We now discuss in a bit more detail how one can understand the appearance of multiplicative coalescents 
in the random graphs we study here. 

We make crucial use of [4, Proposition 7], whose application we now explain. Fix 
For each fixed $t$, consider the construction of the inhomogeneous random graph as in (1.1) but with the
weight sequence $w(t) = (w_j(t))_{1 \leq j \leq n}$ given by

$$w_j(t) = w_j\big(1 + (t + \lambda_n)\,\ell_n\, n^{-2(\tau-2)/(\tau-1)}\big). \tag{1.31}$$

Let

$$X^{(n)}(t) = \big(n^{-(\tau-2)/(\tau-1)}|\mathcal{W}_{(i)}(t)|\big)_{i \geq 1} \tag{1.32}$$

denote the ordered version of component weights. Note that the above process, when taking $t = -\lambda_n +
\lambda/\mathbb{E}[W]$, is closely related to the ordered clusters of our random graph with weights $w_j(\lambda) = w_j(1 + \lambda n^{-(\tau-3)/(\tau-1)})$,
since $\ell_n = \mathbb{E}[W]\, n(1 + o(1))$. We then note that $X^{(n)}$ can be constructed so that, viewed
as a function in $t$, it is a multiplicative coalescent:

Lemma 1.7 (Discrete multiplicative coalescent). We can construct the process $X^{(n)} = (X^{(n)}(t))_{t \geq 0}$ such
that, for each fixed $t$, $X^{(n)}(t)$ has the distribution of the ordered rescaled weighted component sizes of the
random graph with weight sequence given by (1.31) and such that, for each fixed $n$, the process viewed as
a process in $t$ is a multiplicative coalescent. The initial state denoted by $x^{(n)}(0)$ has the same distribution
as the ordered weighted component sizes of a random graph with edge probabilities as in (1.1) and weight
sequence

$$w_j(0) = w_j\big(1 + \lambda_n\, \ell_n\, n^{-2(\tau-2)/(\tau-1)}\big). \tag{1.33}$$

Proof. For each unordered pair $(i,j)$, let $\xi_{ij}$ be an exponential random variable with rate $w_i w_j/\ell_n$, where
the $(\xi_{ij})$ are independent. For fixed $t$, define the graph $\mathcal{G}_n^t$ to consist of all edges $(i,j)$ for which

$$\xi_{ij} \leq 1 + \frac{(t + \lambda_n)\,\ell_n}{n^{2(\tau-2)/(\tau-1)}}. \tag{1.34}$$

Then, by construction, for all $t \geq 0$, the rescaled weighted component sizes of $\mathcal{G}_n^t(w)$ have the same
distribution as $X^{(n)}(t)$. Further, for any time $t$ we note that two distinct clusters $\mathcal{C}_1$ and $\mathcal{C}_2$ having
weights $\mathcal{W}_{(i)}(t)$ and $\mathcal{W}_{(j)}(t)$, respectively, coalesce at rate

$$\frac{\ell_n}{n^{2(\tau-2)/(\tau-1)}} \sum_{s_1 \in \mathcal{C}_1,\, s_2 \in \mathcal{C}_2} \frac{w_{s_1} w_{s_2}}{\ell_n} = \big(n^{-(\tau-2)/(\tau-1)}\mathcal{W}_{(i)}(t)\big)\big(n^{-(\tau-2)/(\tau-1)}\mathcal{W}_{(j)}(t)\big), \tag{1.35}$$

as required. □
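
The coupling used in this proof can be illustrated by the following Python sketch, which draws one exponential clock per unordered pair and then reads off the edge set for any threshold. We assume here that the edge criterion compares each clock with a threshold that is increasing in $t$, of the form $1 + (t + \lambda_n)\ell_n n^{-2(\tau-2)/(\tau-1)}$; this reading, and the names `sample_clocks`, `threshold` and `edges_at`, are our own assumptions for illustration.

```python
import random


def sample_clocks(weights, rng):
    """One exponential clock per unordered pair, with rate w_i * w_j / total."""
    n = len(weights)
    total = sum(weights)
    return {
        (i, j): rng.expovariate(weights[i] * weights[j] / total)
        for i in range(n)
        for j in range(i + 1, n)
    }


def threshold(t, lam_n, total, n, tau):
    """Assumed time-dependent threshold against which the clocks are compared."""
    return 1.0 + (t + lam_n) * total * n ** (-2.0 * (tau - 2.0) / (tau - 1.0))


def edges_at(clocks, level):
    """Edges present when every clock is compared with the same level."""
    return {pair for pair, xi in clocks.items() if xi <= level}
```

Because the same clocks are used for every $t$, the edge sets are nested: raising the threshold can only add edges, so clusters can only merge as $t$ increases. This monotone coupling is what allows the graphs for different $t$ to be viewed as one coalescing process.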

In effect, Theorems 1.1 and 1.2 give us two distinct proofs of the statement that the ordered cluster 
weights converge, and we now discuss the advantages of these two different proofs. Theorem 1.1 proves 
that for any fixed $\lambda$, $(n^{-(\tau-2)/(\tau-1)}|\mathcal{W}_{(i)}|)_{i \geq 1}$ converges in distribution. Further, by the fact that this
vector is obtained by sequentially investigating the clusters of the high-weight vertices, it allows us to 
prove properties about the high-weight vertices that are part of the largest clusters, as in Theorems 1.5- 
1.6. Finally, it allows us to show that the ordered cluster sizes have the same scaling limit as the ordered 
cluster weights (see Theorem 1.4), a feature that is also crucial in the proofs of Theorems 1.2 and 1.3. 

Theorem 1.2, instead, shows that the process of the ordered cluster sizes or weights converges in 
distribution. This means that there exists a stochastic process that describes the joint convergence of the 
ordered cluster sizes or weights for different values of $\lambda$ simultaneously. Due to the fact that the proof of
Theorem 1.2 relies on [4, Proposition 7], however, we obtain less information about the vertices that are 
part of the large critical clusters. The combination of the two proofs provides us with a detailed and full 
understanding of the scaling limit of the ordered cluster sizes or weights. 



1.5 Discussion 

Comparison to the case of weights with finite third moments. In [2, 8, 35], the scaling limit
was considered when $\mathbb{E}[W^3] < \infty$. In this case, the scaling limit turns out to be (a trivial rescaling of)
the scaling limit for the Erdős-Rényi random graph as identified by Aldous in [2]. Thus, the setting for
$\tau \in (3,4)$ is fundamentally different. When $\mathbb{E}[W^3] < \infty$, the probability that $1 \in \mathcal{C}_{\max}$ is negligible, while
in our setting this is not true, as shown in Theorem 1.5.

Other weights. Our proof reveals that the precise limits of $w_i n^{-1/(\tau-1)}$, for fixed $i \geq 1$, arise in the
scaling limit. We make crucial use of the fact that, by (1.10), $c_i = \lim_{n \to \infty} w_i n^{-1/(\tau-1)} = (c_F/i)^{1/(\tau-1)}$.
However, we believe that also when $\lim_{n \to \infty} w_i n^{-1/(\tau-1)}$ exists for every $i \geq 1$ and is asymptotically equal
to $a i^{-1/(\tau-1)}$ for some $a > 0$, our results remain valid. This suggests that, by varying the precise values
of high weights, there are many possible scaling limits to be found. It would be of interest to investigate
this further.

Also, we restrict to $1 - F(x)$ that are, for large $x > 0$, asymptotic to an inverse power of $x$ (see (1.6)).
It would be of interest to investigate the scaling behavior when (1.6) is replaced with the assumption that
$1 - F(x)$ is regularly varying with exponent $1 - \tau$, i.e., $[1 - F](x) = x^{-(\tau-1)}\ell(x)$ for some $x \mapsto \ell(x)$ which
is slowly varying at $\infty$. In this case, we believe that the asymptotic sizes of the largest critical clusters
are given by $\ell^*(n)n^{(\tau-2)/(\tau-1)}$ for some suitable slowly varying function $n \mapsto \ell^*(n)$ that can be described
in terms of $x \mapsto \ell(x)$. For more details, see [24, Section 1.3], where also the critical cases $\tau = 3$ and $\tau = 4$
are discussed.

I.i.d. weights. In our analysis, we make crucial use of the choice for $w_i$ in (1.5). In the literature,
also the setting where $(W_i)_{i\in[n]}$ are independent and identically distributed (i.i.d.) random variables with
distribution function $F$ has been considered. We expect the behavior in this model to be different. Indeed,
let $w_i = W_{(i)}$, where $(W_{(i)})_{i\in[n]}$ are the order statistics, in decreasing order, of the i.i.d. sequence $(W_i)_{i\in[n]}$. It is well known that

$n^{-1/(\tau-1)} W_{(i)} \xrightarrow{d} \xi_i \equiv (E_1 + \cdots + E_i)^{-1/(\tau-1)}$, (1.36)

where $(E_i)_{i\geq 1}$ are i.i.d. exponential random variables with mean 1. In particular, when $\tau \in (3,4)$,
$\mathbb{E}[\xi_i^a] < \infty$ whenever $a < \tau - 1$. The extra randomness of the order statistics has an effect on the scaling
limit, which is thus different. In most cases, the two settings have the same behavior (see, for example,
[8], where this is shown to hold for weights for which $\mathbb{E}[W^3] < \infty$, where $W$ has distribution function $F$).

High-weight vertices. The fact that the vertex $i$ is in the largest connected component with non-vanishing
probability as $n \to \infty$ (see Theorem 1.5) is remarkable and invites some further discussion. In
our setting, a uniformly chosen vertex in $[n]$ is an element of $\mathcal{C}_{\max}$ with negligible probability. The point
is that vertex $i$ has weight $w_i$, which, for $i$ fixed, is close to $(c_F/i)^{1/(\tau-1)}n^{1/(\tau-1)}$, while a uniformly chosen
vertex has a bounded weight. Thus, Theorem 1.5 can be interpreted as saying that the highest-weight
vertices characterize the largest components. In the subcritical case (see e.g. the results by Janson in [27]
or Theorem 1.3), the largest connected component is the one of the vertex with the highest weight, and
the critical situation arises when the highest-weight vertices start connecting to each other.

Connection to the multiplicative coalescent. The mental picture associated with the entrance
boundary of the coalescent here seems to be different from [4], where in spirit many of the component
sizes are of order $n^{2/3}$. Here the entrance boundary describes the sizes of the maximal components rescaled
by $n^{-(\tau-2)/(\tau-1)}$ in the $\lambda \to -\infty$ regime, whilst in [4] they arise as limits of random graphs similar to
critical Erdős-Rényi random graphs, where, in addition to the random edges, there are initially a number
of large "planted" components of sizes $\lfloor c_j n^{1/3} \rfloor$, see [4, Section 1.3]. However, the results of [4] are crucial
in identifying the distribution of the limiting component sizes for fixed $\lambda$. It would be interesting to see
if the stochastic calculus techniques developed in [4] can be further modified to give useful information
about the surplus of edges in the maximal components (the surplus of a component $\mathcal{C}$ with $E(\mathcal{C})$ edges
and $V(\mathcal{C})$ vertices is equal to $E(\mathcal{C}) - (V(\mathcal{C}) - 1)$ and denotes the minimal number of edges that must be
removed from the component to make it a tree).

2 The scaling limit of the cluster of vertex 1



In this section, we identify the scaling limit of $|\mathcal{C}(1)|$. We note from (1.5) that the weight of vertex 1
is maximal, i.e., $w_1 \geq w_2 \geq \cdots \geq w_n$. When $\tau > 4$, the probability that vertex 1 belongs to $\mathcal{C}_{\max}$ is
negligible. When $\tau \in (3,4)$, instead, we shall see that vertex 1 is in $\mathcal{C}_{\max}$ with positive probability, so that
it is quite reasonable to start exploring the cluster of vertex 1 first, since $|\mathcal{C}(1)|$ stochastically dominates
$|\mathcal{C}(j)|$ for all $j \in [n]$. Theorem 2.1 below states that $|\mathcal{C}(1)|$ is of order $n^{(\tau-2)/(\tau-1)}$. By [24, Theorem 1.2],
the same is valid for $|\mathcal{C}_{\max}|$, which confirms the above heuristic.

Theorem 2.1 (Weak convergence of the cluster of vertex 1 for $\tau \in (3,4)$). Fix the Norros-Reittu random
graph with weights $\boldsymbol{w}(\lambda)$ defined in (1.15). Assume that $\nu = 1$ and that (1.6) holds. Then, for all $\lambda \in \mathbb{R}$,

$n^{-(\tau-2)/(\tau-1)}|\mathcal{C}(1)| \xrightarrow{d} H_1(0)$, (2.1)

for some non-degenerate limit $H_1(0)$.

Theorem 2.1 is proved in Section 3.2. We now start by discussing cluster explorations and their 
relation to branching processes, which play an essential role in our proofs. 

2.1 Cluster explorations and their relation to branching processes 

We fix the weight sequence to be $\boldsymbol{w}(\lambda)$ defined in (1.15), and we shall denote the weight of vertex $i$ (or
the $i$th coordinate of $\boldsymbol{w}(\lambda)$) by $w_i(\lambda)$.

In order to prove Theorem 2.1, we make heavy use of the cluster exploration, which is described in
detail in [33] and [24]. The model in [33] is a random multigraph, i.e., a random graph potentially having
self-loops and multiple edges. Indeed, for each $i,j \in [n]$, we let the number of edges between vertex $i$ and
$j$ be $\mathrm{Poi}(w_i(\lambda)w_j(\lambda)/\ell_n(\lambda))$, where, for $\mu > 0$, we let $\mathrm{Poi}(\mu)$ denote a Poisson random variable with mean
$\mu$, and we define

$\ell_n(\lambda) = \sum_{i\in[n]} w_i(\lambda) = \ell_n(1 + \lambda n^{-(\tau-3)/(\tau-1)})$. (2.2)

The numbers of edges between different pairs of vertices are independent. To retrieve our random graph
model, we merge multiple edges and erase self-loops. Then, the probability that an edge exists between
two vertices $i,j \in [n]$ is equal to

$p_{ij} = \mathbb{P}(\mathrm{Poi}(w_i(\lambda)w_j(\lambda)/\ell_n(\lambda)) \geq 1) = 1 - e^{-w_i(\lambda)w_j(\lambda)/\ell_n(\lambda)}$, (2.3)

as required. Further, the number of potential edges from a vertex $i$ has a Poisson distribution with mean
$w_i(\lambda)$. We shall work with the above Poisson random graph instead, and we shall refer to the Poisson
random variable $\mathrm{Poi}(w_i(\lambda))$ as the number of potential neighbors of vertex $i$. When we find which
vertices correspond to these $\mathrm{Poi}(w_i(\lambda))$ potential neighbors, i.e., when we determine their marks,
we can see how many real neighbors there are. Here by a 'mark' we mean a random variable $M$
with distribution

$\mathbb{P}(M = m) = w_m(\lambda)/\ell_n(\lambda) = w_m/\ell_n$, $1 \leq m \leq n$. (2.4)

The variable $M$ corresponds to the actual vertex label associated to the potential vertex. A potential
vertex arising in our exploration process is an actual vertex when its mark has not arisen in the exploration
up to that point. We now describe this cluster exploration in detail.

We denote by $(Z_l)_{l\geq 0}$ the exploration process in the breadth-first search, where $Z_0 = 1$ and where $Z_1$
denotes the number of potential neighbors of the initial vertex (which is in the case of Theorem 2.1 equal
to vertex 1). The variable $Z_l$ has the interpretation of the number of potential neighbors of the first $l$
explored potential vertices in the cluster whose neighbors have not yet been explored. As a result, we
explore by taking one vertex of the 'stack' of size $Z_l$, drawing its mark and checking whether it is a real
vertex, followed by drawing its number of potential neighbors. Thus, we set $Z_0 = 1$, $Z_1 = \mathrm{Poi}(w_1(\lambda))$,
and note that, for $l \geq 2$, $Z_l$ satisfies the recursion relation

$Z_l = Z_{l-1} + X_l - 1$, (2.5)

where $X_l$ denotes the number of potential neighbors of the $l$th potential vertex that is explored. More
precisely, when we explore a potential vertex, we start by drawing its mark in an i.i.d. way with distribution
(2.4). When we have already explored a vertex with the same mark as the one drawn, we turn the status
of the vertex to be explored to inactive, the potential vertex does not become a real vertex, and we proceed
with the next potential vertex. When, instead, it receives a mark which we have not yet seen, then the
potential vertex becomes a real vertex, its mark $M_l \in [n]$ indicating to which vertex in $[n]$ the $l$th explored
vertex corresponds, so that $M_l \in \mathcal{C}(1)$. We then draw $X_l = \mathrm{Poi}(w_{M_l}(\lambda))$, and $X_l$ denotes the number of
potential vertices incident to the real vertex $M_l$. Again, upon exploration, these potential vertices might
become real vertices, and this occurs precisely when their mark corresponds to a vertex in $[n]$ that has not
appeared in the cluster exploration so far. We call the above procedure of drawing a mark for a potential
vertex to investigate whether it corresponds to a real vertex a vertex check.

In [33, Proposition 3.1] (see also [24, Section 3.2, in particular, Proposition 3.4]), the cluster exploration
was described in terms of a thinned marked mixed Poisson branching process. This description implies
that the distribution of $X_l$ (for $2 \leq l \leq n$) is equal to $\mathrm{Poi}(w_{M_l}(\lambda))J_l$, where (a) the marks $(M_l)_{l\geq 2}$ are
i.i.d. random variables with distribution (2.4); and (b) $J_l = 1\{M_l \notin \{1\} \cup \{M_2,\ldots,M_{l-1}\}\}$ is the indicator that
the mark $M_l$ has not been found before and is unequal to 1. Here, the mark $M_l$ is the label of the potential
element of the cluster that we are exploring, and, clearly, if a vertex has already been observed to be part
of $\mathcal{C}(1)$ and its neighbors have been explored, then we should not do so again. We conclude that we arrive
at, for $l \geq 2$,

$Z_l = Z_{l-1} + X_l - 1$, where $X_l = \mathrm{Poi}(w_{M_l}(\lambda))J_l$ and $J_l = 1\{M_l \notin \{1\} \cup \{M_2,\ldots,M_{l-1}\}\}$. (2.6)

Then, the number of vertex checks that have been performed when exploring the cluster of vertex 1 equals
$V(1)$, which is given by

$V(1) = \inf\{l : Z_l = 0\}$, (2.7)

since the first time at which there are no more potential vertices to be checked, all vertices in the cluster
have been checked.

Further, the number of real vertices found to be part of $\mathcal{C}(1)$ after $l$ vertex checks equals

$|\mathcal{C}(1;l)| = 1 + \sum_{j=2}^{l} J_j$, (2.8)

i.e., all the potential vertices, except for those that have a mark that has appeared previously. Therefore,
we conclude that

$|\mathcal{C}(1)| = |\mathcal{C}_{\le}(1)| = 1 + \sum_{j=2}^{V(1)} J_j = V(1) - \sum_{j=2}^{V(1)} (1 - J_j)$. (2.9)

It turns out that the second contribution is an error term (see Lemma 3.6 below), so that the cluster size
of 1 asymptotically corresponds to the first hitting time of 0 of $l \mapsto Z_l$.

Throughout the paper, we abbreviate

$\alpha = 1/(\tau-1)$, $\rho = (\tau-2)/(\tau-1)$, $\eta = (\tau-3)/(\tau-1)$. (2.10)
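
The vertex-check procedure of this section can be sketched in a few lines of code. The following is only an illustration under stated assumptions, not the construction of [33] or [24]: the function names `poisson` and `explore_cluster` are hypothetical, the weights are passed in as a plain list (with $\lambda = 0$, so that the mark law is $w_m/\ell_n$), and vertex 1 is stored at list index 0. It runs the recursion $Z_l = Z_{l-1} + X_l - 1$ until the stack is empty and records the number of checks with $J_l = 0$.

```python
import random
from itertools import accumulate


def poisson(mean, rng):
    """Sample Poi(mean) as the number of unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def explore_cluster(weights, rng):
    """Run the vertex checks of (2.5)-(2.7) started from vertex 1 (list index 0).

    Returns (checks, cluster, repeats): checks plays the role of V(1), cluster is
    the set of real vertices found, and repeats counts the checks with J_l = 0.
    """
    labels = range(len(weights))
    cum = list(accumulate(weights))      # cumulative weights for the mark law (2.4)
    seen = {0}
    z = poisson(weights[0], rng)         # Z_1 = Poi(w_1)
    checks, repeats = 1, 0               # l = 1 is the root itself
    while z > 0:
        checks += 1
        mark = rng.choices(labels, cum_weights=cum)[0]
        if mark in seen:                 # J_l = 0: the mark was used before
            x = 0
            repeats += 1
        else:                            # J_l = 1: a new real vertex is found
            seen.add(mark)
            x = poisson(weights[mark], rng)
        z += x - 1                       # Z_l = Z_{l-1} + X_l - 1
    return checks, seen, repeats
```

By construction, the size of the returned cluster equals the number of checks minus the number of repeated marks, which is the identity (2.9) along a single sample path.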

2.2 Branching process computations

In this section, we discuss some useful facts about branching processes. Note that if, in the recursion
arising in the exploration of the cluster in (2.6), we ignore the $J_l$'s (i.e., we ignore the effect of marks that
have already been used), we arrive at the recursion

$Z_l^{(BP)} = Z_{l-1}^{(BP)} + X_l^{(BP)} - 1$, (2.11)

where now

$X_l^{(BP)} = \mathrm{Poi}(w_{M_l}(\lambda))$, (2.12)

and where $(\mathrm{Poi}(w_{M_l}(\lambda)))_{l\geq 2}$ are i.i.d. random variables, while $M_1 = i$. This recursion is the random walk
description in the exploration of the total progeny of a branching process. Indeed, let

$T(i) = \inf\{l : Z_l^{(BP)} = 0\}$ (2.13)

be the first hitting time of 0 of the process $(Z_l^{(BP)})_{l\geq 0}$. Then, by the random walk description of a branching
process (see e.g., [23, Section 3.3]), $T(i)$ has the same distribution as the total progeny of a branching
process in which the root has offspring distribution $\mathrm{Poi}(w_i(\lambda))$, while the offspring of all other individuals
is i.i.d. with mixed Poisson offspring distribution $\mathrm{Poi}(w_M(\lambda))$, where $M$ has the mark distribution in (2.4).
In the setting in Section 2.1, we have $i = M_1 = 1$, so that we start from the root having mark 1, but in
this section, we shall generalize as well to $M_1 = i$, where $i \in [n]$ is general. Further, we shall also denote
the total progeny of the branching process with offspring distribution $\mathrm{Poi}(w_M(\lambda))$ by $T$. In this section,
we investigate properties of such branching processes.

The connection to branching processes (in particular, the stochastic domination of the cluster sizes by
branching processes due to (2.6)) plays a crucial role in [24], where this comparison was used in order to
prove that $n^{-\rho}|\mathcal{C}_{\max}|$ and $n^{\rho}/|\mathcal{C}_{\max}|$ are tight sequences of random variables. There, only bounds on the
maximal cluster size were shown, while, in this paper, we identify the scaling limit of all large clusters.

The difference between the branching process recursion relation in (2.11) and (2.12), and the corresponding
one for the cluster exploration in (2.6) resides in the random variables $(J_l)_{l\geq 2}$. Indeed, when
$J_l = 0$, then $X_l = \mathrm{Poi}(w_{M_l}(\lambda))J_l = 0$, while $X_l^{(BP)} = \mathrm{Poi}(w_{M_l}(\lambda))$ is unaffected. Therefore, we can think
of this procedure as a thinning of our branching process. Indeed, when the mark of the $l$th potential vertex
has been seen before, then, in the cluster exploration, we remove this vertex and all of its offspring.
Thus, the recursions in (2.11)-(2.12) and (2.6) give us a simultaneous coupling of the cluster exploration
processes and the branching process such that any deviation between the two arises from the thinning of
the potential vertices and the subsequent removal of the branching process tree that is attached to the
thinned potential vertices. This description shall prove to be crucial in the comparison of cluster sizes
and branching process total progenies used in the proofs of Theorems 1.2 and 1.3.

We continue to investigate the critical behavior of the branching processes at hand. We denote

$\nu_n(\lambda) = \frac{1}{\ell_n(\lambda)} \sum_{j\in[n]} w_j(\lambda)^2$, (2.14)

and we write $\nu_n = \nu_n(0)$. Then, we note that

$\nu_n(\lambda) = \mathbb{E}[\mathrm{Poi}(w_M(\lambda))]$, (2.15)

so that $\nu_n(\lambda)$ is the mean offspring of the branching process and $\nu_n(\lambda) \to 1$ corresponds to our branching
process being critical. Further,

$\mathbb{E}[\mathrm{Poi}(w_M(\lambda))(\mathrm{Poi}(w_M(\lambda)) - 1)] = \mathbb{E}[w_M(\lambda)^2] = \frac{1}{\ell_n} \sum_{j\in[n]} w_j w_j(\lambda)^2 \to \infty$, (2.16)

so that our branching process has asymptotically infinite variance in the setting in (1.6). We now give
detailed asymptotics for the mean $\nu_n = \nu_n(0)$ of the above branching process. From these asymptotics, we
can easily deduce the asymptotics of $\nu_n(\lambda) = \nu_n(1 + \lambda n^{-\eta})$ (recall (2.10), (2.14) and (1.15)).

Lemma 2.2 (Sharp asymptotics of $\nu_n$). Let the distribution function $F$ satisfy (1.6), and let $\nu_n = \nu_n(0)$
be given by (2.14) and $\nu$ by (1.11). Then,

$\nu_n = \nu + \zeta n^{-\eta} + o(n^{-\eta})$, (2.17)

where

$\zeta = -\frac{c_F^{2/(\tau-1)}}{\mathbb{E}[W]} \sum_{i=1}^{\infty} \Big[ \int_{i-1}^{i} u^{-2\alpha}\,du - i^{-2\alpha} \Big] \in (-\infty, 0)$. (2.18)

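Proof. By [24, Corollary 3.2], $\ell_n = \sum_{i\in[n]} w_i = n\mathbb{E}[W] + O(n^{\alpha})$, where it is also proved that $\nu_n - \nu =
O(n^{-\eta})$. The sharper asymptotics for $\nu_n$ in (2.17) is obtained by a more careful analysis of the arising
sum. We note that, by the remark below (1.7),

$\nu = \frac{1}{\mathbb{E}[W]} \int_0^1 [1-F]^{-1}(u)^2\,du$. (2.19)

By the asymptotics of $\ell_n$ above, we have that

$\nu_n = \frac{\sum_{i\in[n]} w_i^2}{n\mathbb{E}[W]} + o(n^{-\eta})$. (2.20)

We shall make use of the fact that, when $f$ is non-increasing,

$f(i) \leq \int_{i-1}^{i} f(u)\,du \leq f(i-1)$. (2.21)

Applying this to $f(u) = [1-F]^{-1}(u)^2$, which is non-increasing, we obtain in particular that, for any
$K \geq 1$,

$\int_{K/n}^{1} [1-F]^{-1}(u)^2\,du - \frac{1}{n} w_K^2 \leq \frac{1}{n} \sum_{i=K+1}^{n} w_i^2 \leq \int_{K/n}^{1} [1-F]^{-1}(u)^2\,du$. (2.22)

Now,

$\frac{1}{n} w_K^2 = \frac{c_F^{2\alpha}}{n} (n/K)^{2\alpha} (1 + o(1)) = \Theta(K^{-2\alpha} n^{-\eta})$. (2.23)

Thus, we conclude that

$\nu - \nu_n = \frac{1}{\mathbb{E}[W]} \int_0^{K/n} [1-F]^{-1}(u)^2\,du - \frac{1}{n\mathbb{E}[W]} \sum_{i=1}^{K} w_i^2 + \Theta(K^{-2\alpha} n^{-\eta}) + o(n^{-\eta})$. (2.24)

Next, by (1.6), for every $K \geq 1$ fixed,

$\frac{1}{n} \sum_{i=1}^{K} w_i^2 = n^{-\eta} \sum_{i=1}^{K} (c_F/i)^{2\alpha} + o(n^{-\eta})$, (2.25)

$\int_0^{K/n} [1-F]^{-1}(u)^2\,du = \sum_{i=1}^{K} \int_{(i-1)/n}^{i/n} [1-F]^{-1}(u)^2\,du = n^{-\eta} \sum_{i=1}^{K} \int_{i-1}^{i} (c_F/u)^{2\alpha}\,du + o(n^{-\eta})$. (2.26)

Combining these two estimates yields that

$n^{\eta}(\nu - \nu_n) = \frac{c_F^{2\alpha}}{\mathbb{E}[W]} \sum_{i=1}^{K} \Big[ \int_{i-1}^{i} u^{-2\alpha}\,du - i^{-2\alpha} \Big] + \Theta(K^{-2\alpha}) + o(1)$. (2.27)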

Letting first $n \to \infty$ followed by $K \to \infty$, we conclude that

$\lim_{n\to\infty} n^{\eta}[\nu_n - \nu] = \zeta$, (2.28)

as required. The fact that $\zeta > -\infty$ follows from the fact that, for $i \geq 2$,

$0 \leq \int_{i-1}^{i} u^{-2\alpha}\,du - i^{-2\alpha} \leq (i-1)^{-2\alpha} - i^{-2\alpha}$, (2.29)

which is a summable sequence. □

We conclude that, in the critical regime where $\nu = 1$, we have

$\nu_n(\lambda) = 1 + \theta n^{-\eta} + o(n^{-\eta})$, (2.30)

where $\theta = \lambda + \zeta$. The parameter $\theta \in \mathbb{R}$ indicates the location inside the critical window formed by
the weights $\boldsymbol{w}(\lambda)$. Indeed, in the asymptotics for $\nu_n(\lambda)$ in (2.30), the fact that $\theta = \zeta + \lambda$ arises from
$\nu_n(\lambda) = (1 + \lambda n^{-\eta})\nu_n$, together with the sharp asymptotics of $\nu_n$ in (2.17). The value of $\zeta$ is constant
and does not depend on $\lambda$, while the value of $\lambda$ indicates the location inside the scaling window, so we can,
alternatively, measure the location inside the scaling window by $\theta \in \mathbb{R}$.
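
The sharp asymptotics of Lemma 2.2 can be checked numerically in a special case. The sketch below assumes a pure Pareto tail, $1 - F(x) = c_F x^{-(\tau-1)}$ for all $x \geq c_F^{\alpha}$, so that $w_i = (c_F n/i)^{\alpha}$ exactly; this choice, and the helper names `pareto_weights`, `nu_n`, `nu_limit` and `zeta_constant`, are assumptions made for illustration only. It compares $n^{\eta}(\nu_n - \nu)$ with a truncation of the series defining $\zeta$ in (2.18), using that the sum of the first $K$ brackets equals $K^{1-2\alpha}/(1-2\alpha) - \sum_{i\leq K} i^{-2\alpha}$.

```python
def pareto_weights(n, tau, c_f=1.0):
    """Weights w_i = [1-F]^{-1}(i/n) for the pure Pareto tail 1-F(x) = c_f x^{-(tau-1)}."""
    alpha = 1.0 / (tau - 1.0)
    return [(c_f * n / i) ** alpha for i in range(1, n + 1)]


def nu_n(weights):
    """nu_n = sum_i w_i^2 / sum_i w_i, as in (2.14) with lambda = 0."""
    return sum(w * w for w in weights) / sum(weights)


def nu_limit(tau, c_f=1.0):
    """nu = E[W^2] / E[W] for the pure Pareto tail (closed form)."""
    alpha = 1.0 / (tau - 1.0)
    mean_w = c_f ** alpha / (1.0 - alpha)
    second_moment = c_f ** (2.0 * alpha) / (1.0 - 2.0 * alpha)
    return second_moment / mean_w


def zeta_constant(tau, c_f=1.0, terms=200_000):
    """Truncation of the series for zeta in (2.18) after `terms` summands."""
    alpha = 1.0 / (tau - 1.0)
    s = 2.0 * alpha
    mean_w = c_f ** alpha / (1.0 - alpha)
    # Sum over i <= terms of [ int_{i-1}^{i} u^{-s} du - i^{-s} ].
    bracket_sum = terms ** (1.0 - s) / (1.0 - s) - sum(i ** (-s) for i in range(1, terms + 1))
    return -(c_f ** s / mean_w) * bracket_sum
```

For $\tau = 3.5$ and $n$ of order $10^5$, the two quantities should agree up to a correction that vanishes as $n$ grows, in line with (2.28); note that this example does not impose $\nu = 1$.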

We continue by computing first and second moments of total progenies and their weights, where, for
our marked mixed Poisson branching processes, we define the weight of the branching process total progeny
to be

$w_T = \sum_{l=1}^{T} w_{M_l}$, (2.31)

and similarly for $w_{T(i)}$. Then we can compute the following moments, the proof of which is standard and
shall be omitted:

Lemma 2.3 (Branching process computations). 

(a) 

EP1 = T ±_, nT ^^ + w 2_^ ( , 32) 

3 1 \n\ 



0>) 

EM-^-, EKH^E^. (,33) 



(c) 



(d) 



3 

EK,,,] = T -i_, E1 „| (() ] = ( T -i_) + ^ £ ^. (2.35) 

je[n] 

2.3 Scaling limit of the cluster exploration process

Theorem 2.1 will follow from the fact that we can identify the scaling limit of the process $(Z_l)_{l\geq 0}$. To
identify this scaling limit, we let

$Z_t^{(n)} = n^{-1/(\tau-1)} Z_{tn^{(\tau-2)/(\tau-1)}} = n^{-\alpha} Z_{tn^{\rho}}$, (2.36)

where we recall the abbreviations in (2.10). By convention, for $t \geq 0$ and for a discrete-time process
$(S_l)_{l\geq 0}$, we let $S_t = S_{\lfloor t \rfloor}$.

The intuition behind (2.36) is as follows. First, since the largest connected components are of order
$n^{\rho}$ as proved in [24, Theorem 1.2], and the successive elapsed times between hits of zero of the process
$(Z_l)_{l\geq 0}$ correspond to the cluster sizes, the relevant time scale is $tn^{\rho}$. Further, by Theorem 1.6, we see
that the large clusters correspond to the clusters of the high-weight vertices. The maximal weight is of
the order $n^{\alpha}$, so that this needs to be the relevant scale on which the process $Z_l$ runs. The proof below
makes this intuition precise.

In order to define the scaling limit, we introduce a non-negative continuous-time process $(\mathcal{S}_t)_{t\geq 0}$. For
some $a > 0$, we let $(\mathcal{I}_i(t))_{i\geq 1}$ denote independent increasing indicator processes defined by

$\mathcal{I}_i(s) = 1\{\mathrm{Exp}(ai^{-\alpha}) \in [0,s]\}$, $s \geq 0$, (2.37)

so that

$\mathbb{P}(\mathcal{I}_i(s) = 0\ \forall s \in [0,t]) = e^{-ati^{-\alpha}}$. (2.38)

We further let, for some $b > 0$ and $c \in \mathbb{R}$, and $a$ as in (2.37),

$\mathcal{S}_t = b - abt + ct + \sum_{i=2}^{\infty} bi^{-\alpha}[\mathcal{I}_i(t) - ati^{-\alpha}]$, (2.39)

for all $t \geq 0$. We call $(\mathcal{S}_t)_{t\geq 0}$ a thinned Lévy process, a name we shall explain in more detail after the
theorem. To make the dependence on $(a,b,c)$ explicit, we now denote $\mathcal{S}_t = \mathcal{S}_t(a,b,c)$. Then, we have the
obvious scaling relation

$\mathcal{S}_t(a,b,c) = b\,\mathcal{S}_{at}(1,1,c/(ab))$, (2.40)

where

$\mathcal{S}_t(1,1,\beta) = 1 + (\beta-1)t + \sum_{i=2}^{\infty} i^{-\alpha}[\mathcal{I}_i(t) - ti^{-\alpha}]$, $\mathcal{I}_i(t) = 1\{\mathrm{Exp}(i^{-\alpha}) \in [0,t]\}$. (2.41)
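
The scaling relation (2.40) can be verified directly from (2.39). The short derivation below is a sketch that reads the identity as an equality in distribution of processes, and uses only that an exponential variable with rate $ai^{-\alpha}$ has the law of $1/a$ times an exponential variable with rate $i^{-\alpha}$; the superscript $(1)$, marking the indicators built from rate-$i^{-\alpha}$ exponentials, is notation introduced here for illustration.

```latex
\begin{align*}
\mathcal{S}_t(a,b,c)
 &= b\Big[1 - at + \tfrac{c}{b}\,t
      + \sum_{i\geq 2} i^{-\alpha}\big(\mathcal{I}_i(t) - a t\, i^{-\alpha}\big)\Big] \\
 &\stackrel{d}{=} b\Big[1 + \big(\tfrac{c}{ab} - 1\big)(at)
      + \sum_{i\geq 2} i^{-\alpha}\big(\mathcal{I}^{(1)}_i(at) - (at)\, i^{-\alpha}\big)\Big]
  = b\,\mathcal{S}_{at}\big(1,1,\tfrac{c}{ab}\big).
\end{align*}
```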

The main result concerning the scaling limit of the exploration process is the following theorem:

Theorem 2.4 (The scaling limit of $Z_t^{(n)}$). As $n \to \infty$, under the conditions of Theorem 1.1,

$(Z_t^{(n)})_{t\geq 0} \xrightarrow{d} (\mathcal{S}_t)_{t\geq 0}$, (2.42)

where $a = c_F^{\alpha}/\mathbb{E}[W]$, $b = c_F^{\alpha}$, $c = \theta - ab$, in the sense of convergence in the $J_1$-Skorokhod topology on the
space of càdlàg functions on $\mathbb{R}_+$.

It is worthwhile to note that while the convergence in Theorem 2.4 only has implications for our
random graph for $t < H_1(0)$, which is the hitting time of zero of the process $(\mathcal{S}_t)_{t\geq 0}$, the processes
$(Z_t^{(n)})_{t\geq 0}$ and $(\mathcal{S}_t)_{t\geq 0}$ are well defined also for larger $t$, and convergence holds for all $t$. This is, in fact,
useful in the proof.

The proof of Theorem 2.4 shall be given in Section 3 below. We now first discuss the limiting process
$(\mathcal{S}_t)_{t\geq 0}$ and its connection to Lévy processes. To do this, we denote by $(\mathcal{R}_t)_{t\geq 0}$ the process given by

$\mathcal{R}_t = b - abt + ct + \sum_{i=2}^{\infty} bi^{-\alpha}[N_i(t) - ati^{-\alpha}]$, (2.43)

where $(N_i(t))_{t\geq 0}$ are independent Poisson processes with rates $ai^{-\alpha}$. Clearly, the process $(\mathcal{R}_t)_{t\geq 0}$ is a
spectrally positive Lévy process, i.e., $(\mathcal{R}_t)_{t\geq 0}$ has no negative jumps (see e.g., [6, 31] for more information
on Lévy processes), with exponent (for which $\mathbb{E}(e^{-\vartheta(\mathcal{R}_t - \mathcal{R}_0)}) = e^{-t\psi(\vartheta)}$) given by

$\psi(\vartheta) = (c - ab)\vartheta + \sum_{i=2}^{\infty} ai^{-\alpha}\big[1 - e^{-\vartheta bi^{-\alpha}} - \vartheta bi^{-\alpha}\big]$. (2.44)

Alternatively, the exponent $\psi(\vartheta)$ can be expressed as

$\psi(\vartheta) = (c - ab)\vartheta - \vartheta\int_1^{\infty} x\,\Pi(dx) + \int_0^{\infty} \big(1 - e^{-\vartheta x} - \vartheta x 1_{\{x<1\}}\big)\,\Pi(dx)$, (2.45)

where the Lévy measure $\Pi$ is defined by

$\Pi(dx) = \sum_{i=2}^{\infty} ai^{-\alpha}\delta_{bi^{-\alpha}}(dx)$. (2.46)

Since $\Pi(-\infty, 0) = 0$, the Lévy process is spectrally positive, so that the process $(\mathcal{R}_t)_{t\geq 0}$ has only positive
jumps. Also, $\Pi(b, \infty) = 0$, so that the jumps of $(\mathcal{R}_t)_{t\geq 0}$ are bounded by $b$. Further,

$\int_0^{\infty} (1 \wedge x^2)\,\Pi(dx) \leq \int_0^{\infty} x^2\,\Pi(dx) = \sum_{i=2}^{\infty} ai^{-\alpha}(bi^{-\alpha})^2 = ab^2 \sum_{i=2}^{\infty} i^{-3\alpha} < \infty$, (2.47)

since $\tau \in (3,4)$ so that $3\alpha = 3/(\tau-1) > 1$. Therefore, the process $(\mathcal{R}_t)_{t\geq 0}$ is a well-defined Lévy process.
We may reformulate (2.39) as

$\mathcal{S}_t = b - abt + ct + \sum_{i=2}^{\infty} bi^{-\alpha}[1_{\{N_i(t)\geq 1\}} - ati^{-\alpha}]$, (2.48)

so that the process $(\mathcal{S}_t)_{t\geq 0}$ does not include multiple counts of the independent processes $(N_i(t))_{t\geq 0}$. This
is the reason that we call the process $(\mathcal{S}_t)_{t\geq 0}$ a thinned Lévy process. In [4], this process is called a Lévy
process without repetitions. Naturally, we have that the descriptions in (2.43) and (2.48) satisfy that, a.s.,
for all $t \geq 0$,

$\mathcal{S}_t \leq \mathcal{R}_t$, (2.49)

which allows us to make use of Lévy process methodology in our proofs. We do note that $\mathcal{R}_t$ is a rather
poor approximation for $\mathcal{S}_t$, particularly on large time scales, because the thinning becomes more important
as time progresses.
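
The coupling between the thinned process and the Lévy process can be illustrated by simulation. The sketch below is not part of the proofs: it truncates the infinite sums in (2.43) and (2.48) at a finite level `K` (an approximation introduced here), and the function names are hypothetical. Between jumps, the truncated path $t \mapsto \mathcal{S}_t$ is linear with slope $c - ab\sum_{i=1}^{K} i^{-2\alpha}$, which makes the first hitting time of zero of the truncated process computable exactly from the sorted jump times.

```python
import random


def poisson(mean, rng):
    """Sample Poi(mean) as the number of unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def truncated_drift(a, b, c, alpha, K):
    """Slope of the truncated path between jumps: c - a*b*sum_{i<=K} i^{-2*alpha}."""
    return c - a * b * sum(i ** (-2.0 * alpha) for i in range(1, K + 1))


def thinned_and_levy_at(t, a, b, c, alpha, K, rng):
    """Sample (S_t, R_t) from the same Poisson counts N_i(t), truncated at K terms."""
    s = r = b + truncated_drift(a, b, c, alpha, K) * t
    for i in range(2, K + 1):
        count = poisson(a * i ** (-alpha) * t, rng)
        r += b * i ** (-alpha) * count           # the Levy process counts every arrival
        s += b * i ** (-alpha) * min(count, 1)   # the thinned process counts only the first
    return s, r


def first_hit_of_zero(a, b, c, alpha, K, rng):
    """First time the truncated thinned process reaches zero (requires negative drift)."""
    drift = truncated_drift(a, b, c, alpha, K)
    if drift >= 0:
        raise ValueError("truncated drift must be negative")
    # Each index i >= 2 jumps once, at an Exp(a * i^{-alpha}) time, by b * i^{-alpha}.
    jumps = sorted((rng.expovariate(a * i ** (-alpha)), b * i ** (-alpha))
                   for i in range(2, K + 1))
    t, level = 0.0, b
    for jump_time, size in jumps:
        hit = t + level / (-drift)
        if hit <= jump_time:
            return hit
        level += drift * (jump_time - t) + size
        t = jump_time
    return t + level / (-drift)
```

Since all jumps are upward, the simulated hitting time is never smaller than $b$ divided by the absolute value of the truncated drift, and the sampled pair always satisfies the ordering in (2.49).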



3 Proof of Theorems 2.1 and 2.4 

In this section, we prove Theorems 2.1 and 2.4. We start by proving Theorem 2.4 in Section 3.1, and 
make use of Theorem 2.4 to prove Theorem 2.1 in Section 3.2. 



3.1 Proof of Theorem 2.4 

Instead of $(Z_l)_{l\geq 0}$, it is convenient to work with a related process $(S_l)_{l\geq 0}$, which is defined by $S_0 = 1$,
$S_1 = w_1(\lambda)$ and satisfies the recursion relation, for $l \geq 2$,

$S_l = S_{l-1} + w_{M_l}(\lambda)J_l - 1$, (3.1)

i.e., the Poisson random variables $\mathrm{Poi}(w_{M_l}(\lambda))$ appearing in the recursion for $Z_l$ in (2.6) are replaced with
their (random) weights $w_{M_l}(\lambda)$. We shall first show that $S_l$ and $Z_l$ are quite close:

Lemma 3.1. Uniformly in $m > 0$,

$\sup_{l\leq m} |Z_l - S_l| = O_{\mathbb{P}}(m^{1/2})$. (3.2)

Proof. We have that $(Z_l - S_l)_{l\geq 0}$ is a martingale w.r.t. the filtration $\mathcal{F}_l = \sigma((M_i)_{i=1}^{l})$. Therefore, by the
Doob-Kolmogorov inequality [21, Theorem (7.8.2), p. 338], for any $M > 0$,

$\mathbb{P}\big(\sup_{l\leq m} |Z_l - S_l| \geq M\sqrt{m}\big) \leq \frac{1}{mM^2}\,\mathbb{E}\big[|Z_m - S_m|^2\big]$. (3.3)

Now,

$\mathbb{E}\big[|Z_m - S_m|^2\big] = \mathbb{E}\big[\mathbb{E}[|Z_m - S_m|^2 \mid (M_i)_{i=1}^{m}]\big] = \mathbb{E}\Big[\sum_{i=1}^{m} w_{M_i}(\lambda)J_i\Big] \leq \mathbb{E}\Big[\sum_{i=1}^{m} w_{M_i}(\lambda)\Big] = m\nu_n(\lambda) = m(1 + o(1))$, (3.4)

by (2.30). This proves the claim. □

We proceed by investigating the scaling limit of $(S_l)_{l\geq 0}$. For this, we define

$\mathcal{S}_t^{(n)} = n^{-\alpha}S_{tn^{\rho}}$, (3.5)

where we recall the rounding convention right below (2.36).

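We shall prove that, in the sense of convergence in the $J_1$-Skorokhod topology on the space of càdlàg
functions on $\mathbb{R}_+$,

$(\mathcal{S}_t^{(n)})_{t\geq 0} \xrightarrow{d} (\mathcal{S}_t)_{t\geq 0}$, (3.6)

which shall be enough to prove Theorem 2.4. Indeed, to see that (3.6) implies Theorem 2.4, we note that
by Lemma 3.1, for every $t = o(n^{(4-\tau)/(\tau-1)})$,

$\sup_{s\leq t} |Z_s^{(n)} - \mathcal{S}_s^{(n)}| = O_{\mathbb{P}}\big(\sqrt{t\,n^{(\tau-4)/(\tau-1)}}\big) = o_{\mathbb{P}}(1)$. (3.7)

We continue with the proof of (3.6). We shall prove that, due to (2.9) and Lemma 3.1, the first hitting
time of 0 of $\mathcal{S}_t^{(n)}$ is close to $n^{-\rho}|\mathcal{C}_{\le}(1)|$. We note that, by (3.1),

$S_l = w_1(\lambda) + \sum_{i=2}^{l} w_{M_i}(\lambda)J_i - (l-1) = w_1(\lambda) + \sum_{i=2}^{n} w_i(\lambda)\mathcal{I}_i^{(n)}(l) - (l-1)$, (3.8)

where

$\mathcal{I}_i^{(n)}(l) = 1_{\{i \in \mathcal{V}_l^{(n)}\}}$, with $\mathcal{V}_l^{(n)} = \bigcup_{j=2}^{l} \{M_j\}$. (3.9)

Using that

$\nu_n(\lambda) = \sum_{i\in[n]} \frac{w_i(\lambda)w_i}{\ell_n}$, (3.10)

we can rewrite $S_l$ as

$S_l = w_1(\lambda) - \frac{(l-1)w_1(\lambda)w_1}{\ell_n} + \sum_{i=2}^{n} w_i(\lambda)\Big[\mathcal{I}_i^{(n)}(l) - \frac{(l-1)w_i}{\ell_n}\Big] + (\nu_n(\lambda) - 1)(l-1)$. (3.11)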
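
Now we take $l = tn^{\rho}$, use $\nu_n(\lambda) - 1 = \theta n^{-\eta} + o(n^{-\eta})$ (recall (2.30) and (2.10)), and we recall from
(1.5)-(1.6) that, for $i$ such that $n/i \to \infty$,

$w_i = [1 - F]^{-1}(i/n) = b(n/i)^{\alpha}(1 + o(1))$, (3.12)

where $b = c_F^{\alpha}$ and $c_F$ is defined in (1.6). As a result, by (3.11),

$\mathcal{S}_t^{(n)} = n^{-\alpha}S_{tn^{\rho}} = b - \frac{b^2}{\mu_n}t + \sum_{i=2}^{n} n^{-\alpha}w_i(\lambda)\Big[\mathcal{I}_i^{(n)}(tn^{\rho}) - n^{-\alpha}\frac{w_i t}{\mu_n}\Big] + \theta t + o(1)$, (3.13)

where we write $\mu_n = \ell_n/n = \mathbb{E}[W] + o(1)$.

We proceed by showing that the sum in (3.13) is predominantly carried by the first few terms. Define

$M_l^{(n,K)} = \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\Big[\mathcal{I}_i^{(n)}(l) - \frac{(l-1)w_i}{\ell_n}\Big]$. (3.14)

We compute the mean and variance of $M_l^{(n,K)}$ for $K$ large. For the mean, we compute

$\mathbb{E}[M_l^{(n,K)}] = \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\Big[\mathbb{P}(\mathcal{I}_i^{(n)}(l) = 1) - \frac{(l-1)w_i}{\ell_n}\Big] = -\sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\Big[\Big(1 - \frac{w_i}{\ell_n}\Big)^{l-1} - 1 + \frac{(l-1)w_i}{\ell_n}\Big]$. (3.15)

Thus, since $0 \leq (1-x)^l - 1 + lx \leq (lx)^2/2$ for $x \in [0,1]$, we have that $\mathbb{E}[M_l^{(n,K)}] \leq 0$, and

$|\mathbb{E}[M_l^{(n,K)}]| = \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\Big[\Big(1 - \frac{w_i}{\ell_n}\Big)^{l-1} - 1 + \frac{(l-1)w_i}{\ell_n}\Big] \leq \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\Big(\frac{lw_i}{\ell_n}\Big)^2$. (3.16)

Here and in the sequel, $C > 0$ denotes a constant that can change from line to line. By (2.10) and
the fact that $\ell_n = \Theta(n)$, we have that $ln^{\alpha}/\ell_n = \Theta(ln^{\alpha-1}) = \Theta(ln^{-\rho})$, so that, uniformly in $l \leq tn^{\rho}$,

$|\mathbb{E}[M_l^{(n,K)}]| \leq Ct^2K^{1-3\alpha}$. (3.17)

To compute the variance of $M_l^{(n,K)}$, we start by noting that $\mathcal{I}_i^{(n)}(l)$ is the indicator that $i \in \mathcal{V}_l^{(n)}$, and
$\mathcal{V}_l^{(n)}$ contains the first $l$ marks drawn, where $M_1 = 1$ and the marks $(M_j)_{j=2}^{l}$ are i.i.d. with distribution
given by (2.4). Therefore, $\mathcal{I}_i^{(n)}(l)$ and $\mathcal{I}_j^{(n)}(l)$ are, for different $i, j$, negatively correlated, so that

$\mathrm{Var}(M_l^{(n,K)}) \leq \sum_{i=K}^{n} (n^{-\alpha}w_i(\lambda))^2\,\mathrm{Var}(\mathcal{I}_i^{(n)}(l))$. (3.18)

Since $\mathcal{I}_i^{(n)}(l)$ is an indicator,

$\mathrm{Var}(\mathcal{I}_i^{(n)}(l)) \leq \mathbb{E}[\mathcal{I}_i^{(n)}(l)] \leq lw_i/\ell_n$. (3.19)

Therefore, when $l = tn^{\rho}$, and using that $\rho + \alpha = 1$ (recall (2.10)),

$\mathrm{Var}(M_l^{(n,K)}) \leq \sum_{i=K}^{n} (n^{-\alpha}w_i(\lambda))^2\,\frac{tn^{\rho}w_i}{\ell_n} \leq Ct\sum_{i=K}^{n} i^{-3\alpha} \leq CtK^{1-3\alpha} = o(1)$, (3.20)

when $K \to \infty$, since $\tau \in (3,4)$, so that $\alpha = 1/(\tau-1) > 1/3$.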
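
We next observe that $(M_l^{(n,K)})_{l\geq 1}$ is a supermartingale, since

$\mathbb{E}\big[M_{l+1}^{(n,K)} - M_l^{(n,K)} \mid (\mathcal{I}_i^{(n)}(l))_{i\in[n]}\big] = \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\,\mathbb{E}\Big[\mathcal{I}_i^{(n)}(l+1) - \mathcal{I}_i^{(n)}(l) - \frac{w_i}{\ell_n} \,\Big|\, (\mathcal{I}_i^{(n)}(l))_{i\in[n]}\Big]$
$\leq \sum_{i=K}^{n} n^{-\alpha}w_i(\lambda)\big(1 - \mathcal{I}_i^{(n)}(l)\big)\Big(\mathbb{E}\big[\mathcal{I}_i^{(n)}(l+1) \mid (\mathcal{I}_i^{(n)}(l))_{i\in[n]}\big] - \frac{w_i}{\ell_n}\Big) = 0$. (3.21)

Therefore, by the maximal inequality [21, Theorem 12.6.1, p. 496],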

P( m ax|M<"->| > e) < JglSgQ. (3.22) 

Z<m £ 

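We further bound, using Cauchy-Schwarz,

$\mathbb{E}[|M_m^{(n,K)}|] \leq |\mathbb{E}[M_m^{(n,K)}]| + \sqrt{\mathrm{Var}(M_m^{(n,K)})}$. (3.23)

Thus, by (3.17) and (3.20), and uniformly in $m \leq tn^{\rho}$,

$\mathbb{P}\big(\max_{l\leq m} |M_l^{(n,K)}| \geq \varepsilon\big) \leq Ct^2\varepsilon^{-1}K^{1-3\alpha} + \varepsilon^{-1}\sqrt{CtK^{1-3\alpha}}$. (3.24)

Since $\tau < 4$, we obtain that, uniformly in $n$, we can take $K = K(\varepsilon)$ so large that
$\mathbb{P}(\max_{l\leq m} |M_l^{(n,K)}| \geq \varepsilon) \leq \varepsilon$.

We denote

$\mathcal{S}_t^{(n,K)} = b - \frac{b^2}{\mu_n}t + \sum_{i=2}^{K-1} n^{-\alpha}w_i(\lambda)\Big[\mathcal{I}_i^{(n)}(tn^{\rho}) - n^{-\alpha}\frac{w_i t}{\mu_n}\Big] + \theta t$. (3.25)

Then we obtain the following corollary:

Corollary 3.2 (Finite sum approximation of $\mathcal{S}_t^{(n)}$). For every $\varepsilon, \delta, T > 0$, there exist $K > 0$ and $N \geq 1$
such that for all $n \geq N$,

$\mathbb{P}\big(\sup_{t\leq T} |\mathcal{S}_t^{(n)} - \mathcal{S}_t^{(n,K)}| > \delta\big) < \varepsilon$. (3.26)

The above suggests that it suffices to investigate $(\mathcal{I}_i^{(n)}(tn^{\rho}))_{i\in[K]}$.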
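
Lemma 3.3 (Convergence of indicators). As $n \to \infty$, for all $K \geq 1$,

$(\mathcal{I}_i^{(n)}(tn^{\rho}))_{i\in[K], t\geq 0} \xrightarrow{d} (\mathcal{I}_i(t))_{i\in[K], t\geq 0}$. (3.27)

As a consequence, for all $K \geq 1$,

$(\mathcal{S}_t^{(n,K)})_{t\geq 0} \xrightarrow{d} (\mathcal{S}_t^{(\infty,K)})_{t\geq 0}$, (3.28)

where the limiting process $(\mathcal{S}_t^{(\infty,K)})_{t\geq 0}$ is defined as

$\mathcal{S}_t^{(\infty,K)} = b - \frac{b^2}{\mathbb{E}[W]}t + \sum_{i=2}^{K-1} bi^{-\alpha}[\mathcal{I}_i(t) - ai^{-\alpha}t] + \theta t$. (3.29)

In both statements, $\xrightarrow{d}$ refers to convergence in the $J_1$-Skorokhod topology on the space of càdlàg functions
on $\mathbb{R}_+$.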

Proof. Convergence of the process for $t \geq 0$ follows when the process converges for $t \in [0,T]$ for all $T > 0$
(see [9, Lemma 3, page 173]).

Since $(\mathcal{I}_i^{(n)}(tn^{\rho}))_{t\geq 0}$ are all indicator processes of the form

$\mathcal{I}_i^{(n)}(tn^{\rho}) = 1_{\{T_i \leq tn^{\rho}\}}$, (3.30)

where $T_i$ is the first time that mark $i$ is chosen, it suffices to prove that

$(n^{-\rho}T_i)_{i\in[K]} \xrightarrow{d} (E_i)_{i\in[K]}$, (3.31)

where $E_i$ are independent exponentials with rate $ai^{-\alpha}$. For this, in turn, it suffices to prove that, for
every sequence $t_1, \ldots, t_K$,

$\mathbb{P}(n^{-\rho}T_i > t_i\ \forall i \in [K]) \to \exp\Big(-a\sum_{i=1}^{K} i^{-\alpha}t_i\Big)$. (3.32)

The latter is equivalent to

$\mathbb{P}(\mathcal{I}_i^{(n)}(t_in^{\rho}) = 0\ \forall i \in [K]) \to \mathbb{P}(\mathcal{I}_i(t_i) = 0\ \forall i \in [K]) = \exp\Big(-a\sum_{i=1}^{K} i^{-\alpha}t_i\Big)$. (3.33)

Now, since the marks are i.i.d., we obtain that

$\mathbb{P}(\mathcal{I}_i^{(n)}(m_i) = 0\ \forall i \in [K]) = \prod_{l=1}^{\infty} \mathbb{P}(M_l \notin \{i \in [K] : l \leq m_i\}) = \prod_{l=1}^{\infty}\Big(1 - \sum_{i: l\leq m_i} \frac{w_i}{\ell_n}\Big)$. (3.34)

A Taylor expansion gives that

$\mathbb{P}(\mathcal{I}_i^{(n)}(m_i) = 0\ \forall i \in [K]) = \exp\Big(-\sum_{l=1}^{\infty}\sum_{i: m_i\geq l} \frac{w_i}{\ell_n} + o(1)\Big) = \exp\Big(-\sum_{i\in[K]} \frac{m_iw_i}{\ell_n} + o(1)\Big)$. (3.35)

Applying this to $m_i = t_in^{\rho}$, for which

$\frac{m_iw_i}{\ell_n} = \frac{bt_ii^{-\alpha}}{\mathbb{E}[W]}(1 + o(1))$, (3.36)

we arrive at the claim in (3.27) with $a = b/\mathbb{E}[W]$. The claim in (3.28) follows from the fact that, by
(3.25), $\mathcal{S}_t^{(n,K)}$ is a weighted sum of the $(\mathcal{I}_i^{(n)}(tn^{\rho}))_{i\in[K]}$, and the (deterministic) weights converge. Thus,
the continuous mapping theorem gives the claim. □
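
The exponential limit behind Lemma 3.3 is easy to check numerically for a single mark. The sketch below again assumes the pure Pareto weights $w_i = (c_F n/i)^{\alpha}$, for which $b = c_F^{\alpha}$ and $\mathbb{E}[W] = c_F^{\alpha}/(1-\alpha)$, so that $a = b/\mathbb{E}[W] = 1 - \alpha$; these choices and the function names are assumptions for illustration. It compares the exact probability that mark $i$ is not drawn in $\lfloor tn^{\rho} \rfloor$ i.i.d. draws from (2.4) with the limit $e^{-ai^{-\alpha}t}$.

```python
import math


def no_hit_probability(i, t, n, tau, c_f=1.0):
    """Exact probability that mark i is absent from floor(t * n^rho) i.i.d. draws."""
    alpha = 1.0 / (tau - 1.0)
    rho = (tau - 2.0) / (tau - 1.0)
    weights = [(c_f * n / j) ** alpha for j in range(1, n + 1)]
    ell_n = sum(weights)
    draws = math.floor(t * n ** rho)
    return (1.0 - weights[i - 1] / ell_n) ** draws


def limiting_probability(i, t, tau, c_f=1.0):
    """Limit exp(-a * i^{-alpha} * t) with a = b / E[W] for the pure Pareto tail."""
    alpha = 1.0 / (tau - 1.0)
    b = c_f ** alpha
    mean_w = c_f ** alpha / (1.0 - alpha)
    a = b / mean_w
    return math.exp(-a * i ** (-alpha) * t)
```

For example, with $\tau = 3.5$ and $n = 10^5$, the two functions can be evaluated at a few pairs $(i,t)$ to see how small the discrepancy already is at moderate $n$.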

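Proof of Theorem 2.4. Again we use that convergence of the process for $t \geq 0$ follows when the process
converges for $t \in [0,T]$ for all $T > 0$ (see [9, Lemma 3, page 173]). By (3.26), with probability $1 - o(1)$
when first $n \to \infty$ and then $K \to \infty$, the process $(\mathcal{S}_t^{(n)})_{t\in[0,T]}$ is uniformly close to $(\mathcal{S}_t^{(n,K)})_{t\in[0,T]}$. By
Lemma 3.3, the process $(\mathcal{S}_t^{(n,K)})_{t\geq 0}$ converges to $(\mathcal{S}_t^{(\infty,K)})_{t\geq 0}$. Now,

$\mathcal{S}_t - \mathcal{S}_t^{(\infty,K)} = \sum_{i\geq K} bi^{-\alpha}[\mathcal{I}_i(t) - ati^{-\alpha}]$, (3.37)

and similar techniques as used to prove (3.24) can be used to prove that

$\mathbb{P}\big(\max_{t\leq T} |\mathcal{S}_t - \mathcal{S}_t^{(\infty,K)}| \geq \varepsilon\big) \leq CT^2\varepsilon^{-1}K^{1-3\alpha} + \varepsilon^{-1}\sqrt{CTK^{1-3\alpha}}$, (3.38)

so that again we can take $K = K(\varepsilon)$ so large that $\mathbb{P}(\max_{t\leq T} |\mathcal{S}_t - \mathcal{S}_t^{(\infty,K)}| \geq \varepsilon) \leq \varepsilon$. This proves the
claim. □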

3.2 Proof of Theorem 2.1

In this section, we give a proof of Theorem 2.1. We start by looking at the first hitting time of zero of
the process $l \mapsto Z_l$, and use the fact that by (2.7), $V(1) = \inf\{l : Z_l = 0\}$, where we recall that $V(1)$
denotes the number of vertex checks performed in exploring the cluster of vertex 1. Recall further that
$\mathcal{C}(1)$ denotes the cluster of vertex 1, $|\mathcal{C}(1)|$ the number of vertices in it, and $W(1) = \sum_{j\in\mathcal{C}(1)} w_j$ its weight.

The proof proceeds as follows. We shall first use Theorem 2.4 and Lemma 3.1 to prove that $V(1)n^{-\rho}$
converges in distribution to $H^{\mathcal{S}}(0)$, where $H^{\mathcal{S}}(0)$ denotes the first hitting time of 0 of the process $(\mathcal{S}_t)_{t\geq 0}$
(see Corollary 3.4 below). We then prove that $V(1)n^{-\rho}$, $|\mathcal{C}(1)|n^{-\rho}$ and $W(1)n^{-\rho}$ have identical scaling
limits, by looking at the contribution due to the second term in (2.9) for $|\mathcal{C}(1)|n^{-\rho}$, and a similar computation
for $W(1)n^{-\rho}$ (see Lemma 3.6 below). We then complete the proof of Theorem 2.1, both for
$|\mathcal{C}(1)|n^{-\rho}$ and for $W(1)n^{-\rho}$. Finally, in Proposition 3.7, we state and prove an auxiliary result concerning
joint convergence of $|\mathcal{C}(1)|n^{-\rho}$ and the indicators $1_{\{q\in\mathcal{C}(1)\}}$ for all $q$. This result is useful in the proof of
Theorems 1.1 and 1.5, and plays a crucial role in the proof of Theorem 4.1 in the next section, where we
investigate the scaling limit of several clusters simultaneously.

By Theorem 2.4 and Lemma 3.1, the process $(Z_t^{(n)})_{t\geq 0}$, where $Z_t^{(n)} = n^{-\alpha}Z_{tn^{\rho}}$, converges in distribution
to the process $(\mathcal{S}_t)_{t\geq 0}$. By (3.6), the same applies to $(\mathcal{S}_t^{(n)})_{t\geq 0}$. Note that

$n^{-\rho}V(1) = \min\{t : Z_t^{(n)} = 0\} = H^{(n)}(0)$. (3.39)

We next prove convergence in distribution of $n^{-\rho}V(1)$:

Corollary 3.4 (Convergence of hitting times). As $n \to \infty$,

$n^{-\rho}V(1) \xrightarrow{d} H^{\mathcal{S}}(0)$, (3.40)

where

$H^{\mathcal{S}}(x) = \inf\{t : \mathcal{S}_t \leq x\}$ (3.41)

is the first hitting time of level $x$ of $(\mathcal{S}_t)_{t\geq 0}$.

Proof. Since the process $(\mathcal{S}_t)_{t\geq 0}$ has only positive jumps, [26, Proposition 2.11, in Chapter 6] implies that
the hitting time of zero is a continuous function a.s. under the probability measure of the limiting process
on the space of càdlàg functions equipped with the $J_1$-Skorokhod topology. □

Lemma 3.5 ($\mathcal{S}_t$ has a density). For all $t > 0$, $\mathcal{S}_t$ has a density. As a result, the distribution of $H^{\mathcal{S}}(0)$
has no atoms.

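Proof. We note that $S_t$ has a density if and only if $S'_t$ has, where

$S'_t = \sum_{j=2}^{\infty} j^{-\alpha}\big[\mathcal{I}_j(t) - t j^{-\alpha}\big],$ (3.42)

and $(\mathcal{I}_j(t))_{j\ge 2}$ are independent indicator processes with rate $j^{-\alpha}$. This, in turn, follows when the
characteristic function of $S'_t$ is integrable (see e.g., [21, p. 189]).
The characteristic function of $S'_t$ is given by

$\phi_{S'_t}(\theta) = \mathbb{E}[e^{i\theta S'_t}] = \prod_{j=2}^{\infty} e^{-i\theta t j^{-2\alpha}}\big(1 + (e^{i j^{-\alpha}\theta} - 1)e^{-j^{-\alpha}t}\big).$ (3.43)

Thus, for every $j_\theta \ge 2$,

$|\phi_{S'_t}(\theta)| \le \prod_{j\ge j_\theta} \big|1 + (e^{i j^{-\alpha}\theta} - 1)e^{-j^{-\alpha}t}\big|.$ (3.44)
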
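Next, note that

$\big|1 + (e^{i j^{-\alpha}\theta} - 1)e^{-j^{-\alpha}t}\big|^2 = e^{-2j^{-\alpha}t}\sin(j^{-\alpha}\theta)^2 + \big(1 - e^{-j^{-\alpha}t} + \cos(j^{-\alpha}\theta)e^{-j^{-\alpha}t}\big)^2$

$= 1 - 2(1 - e^{-j^{-\alpha}t})e^{-j^{-\alpha}t}[1 - \cos(j^{-\alpha}\theta)]$

$\le e^{-2(1-e^{-j^{-\alpha}t})e^{-j^{-\alpha}t}[1-\cos(j^{-\alpha}\theta)]} \le e^{-j^{-\alpha}t[1-\cos(j^{-\alpha}\theta)]},$

so that

$|\phi_{S'_t}(\theta)| \le e^{-\frac{t}{2}\sum_{j\ge j_\theta} j^{-\alpha}[1 - \cos(j^{-\alpha}\theta)]} = e^{-t\Psi(\theta)}.$ (3.45)

We choose

$j_\theta = \min\{j \ge 2 : \theta j^{-\alpha} \le \pi/2\},$ (3.46)

so that

$j_\theta = \lceil (2\theta/\pi)^{1/\alpha}\rceil \vee 2 = \lceil (2\theta/\pi)^{\tau-1}\rceil \vee 2.$ (3.47)

Then we bound

$\Psi(\theta) = \frac{1}{2}\sum_{j=j_\theta}^{\infty} j^{-\alpha}[1 - \cos(\theta j^{-\alpha})].$ (3.48)

Next, we use that

$1 - \cos(x) \ge \frac{2}{\pi^2}x^2, \quad x\in[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi],$ (3.49)

to arrive at

$\Psi(\theta) \ge c\theta^2\sum_{j\ge j_\theta} j^{-3\alpha},$ (3.50)

where $c > 0$ denotes a positive constant appearing in lower bounds that possibly changes from line to
line. We arrive at the fact that

$\Psi(\theta) \ge c\theta^2 j_\theta^{1-3\alpha} \ge c\theta^{\tau-2},$ (3.51)

so that $|\phi_{S'_t}(\theta)|$ is integrable. To prove that $H^S(0)$ has no atoms, note that when $\mathbb{P}(H^S(0) = u) > 0$ for
some $u > 0$, then, in particular, $\mathbb{P}(S_u = 0) > 0$, which contradicts the fact that $S_u$ has a density. □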
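The two elementary facts used in this proof can be checked numerically. The sketch below is only an illustration and not part of the argument: it writes $x$ for the phase $j^{-\alpha}\theta$ and $q$ for the weight $e^{-j^{-\alpha}t}$, and the function names are hypothetical.

```python
import cmath
import math


def modulus_squared(x: float, q: float) -> float:
    """Return |1 + (e^{ix} - 1) q|^2 for a phase x and a weight q in [0, 1]."""
    return abs(1 + (cmath.exp(1j * x) - 1) * q) ** 2


def closed_form(x: float, q: float) -> float:
    """Closed form 1 - 2 (1 - q) q [1 - cos(x)] of the same quantity."""
    return 1 - 2 * (1 - q) * q * (1 - math.cos(x))


def cosine_lower_bound_holds(x: float) -> bool:
    """Check 1 - cos(x) >= (2 / pi^2) x^2, which is valid for |x| <= pi / 2."""
    return 1 - math.cos(x) >= 2 * x * x / math.pi ** 2 - 1e-15
```

Evaluating these on a grid of $x\in[-\pi/2,\pi/2]$ and $q\in[0,1]$ confirms the identity behind the second line of the display and the cosine bound used to pass from (3.48) to (3.50).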

We proceed by showing that the scaling limits of the number of vertex checks of a cluster and the cluster 
size are identical. For this, we shall make use of the following lemma: 

Lemma 3.6 (Number of multiple hits is small). As $n\to\infty$, for any $m \ge 1$,



4£[l-4]]<= + =^P^. (3-52) 

J =2 



Consequently, there exists $t_n\to\infty$ such that

$n^{-\rho}\sum_{j=2}^{t_n n^{\rho}} [1 - J_j] \xrightarrow{\mathbb{P}} 0.$ (3.53)

Proof. We note that $J_j = 0$ precisely when $M_j = 1$ or when there exist an $l < j$ and $i\in[n]$ such that
$M_l = M_j = i$. By independence and (2.4),

$\mathbb{P}(M_l = M_j = i) = \mathbb{P}(M_l = i)\mathbb{P}(M_j = i) = w_i^2/\ell_n^2.$ (3.54)

Therefore, 
Summing the above inequality over $2 \le j \le m$ proves the claim in (3.52).
For (3.53), we use the Markov inequality to bound

$\mathbb{P}\Big(n^{-\rho}\sum_{j=2}^{t_n n^{\rho}} [1 - J_j] \ge \varepsilon_n\Big) \le \varepsilon_n^{-1} n^{-\rho}\sum_{j=2}^{t_n n^{\rho}} \mathbb{E}[1 - J_j] = o(1),$

whenever $t_n^2 n^{-\alpha}/\varepsilon_n = o(1)$. Choosing, for example, $t_n = \log n$ and $\varepsilon_n = 1/\log n$ does the trick. □
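The union bound behind this proof can be sketched as follows. This is a reconstruction under the assumption that (2.4) states $\mathbb{P}(M_j = i) = w_i/\ell_n$, which is what (3.54) uses; the constants need not match those in (3.52).

```latex
\begin{align*}
\mathbb{E}[1 - J_j]
  &\le \mathbb{P}(M_j = 1) + \sum_{l < j} \sum_{i \in [n]} \mathbb{P}(M_l = M_j = i)
   \le \frac{w_1}{\ell_n} + (j - 1) \sum_{i \in [n]} \frac{w_i^2}{\ell_n^2}, \\
\sum_{j=2}^{m} \mathbb{E}[1 - J_j]
  &\le \frac{m\, w_1}{\ell_n} + \frac{m^2}{2} \sum_{i \in [n]} \frac{w_i^2}{\ell_n^2}.
\end{align*}
```

With $m = t_n n^{\rho}$, $w_1$ of order $n^{\alpha}$, $\ell_n$ of order $n$ and $\sum_i w_i^2/\ell_n$ bounded, the two terms multiplied by $n^{-\rho}$ are of order $t_n n^{-\rho}$ and $t_n^2 n^{-\alpha}$, using $\alpha + \rho = 1$.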



Now we are ready to complete the proof of Theorem 2.1:
Proof of Theorem 2.1. By Corollary 3.4, $n^{-\rho}V(1) \xrightarrow{d} H^S(0)$. In particular, this implies that, whp, $|C(1)| \le V(1) \le n^{\rho}t_n$
for any $t_n\to\infty$. Therefore, by (2.9), and whp,

$n^{-\rho}V(1) - n^{-\rho}\sum_{j=2}^{t_n n^{\rho}} [1 - J_j] \le n^{-\rho}|C(1)| \le n^{-\rho}V(1).$ (3.56)

Now, by Lemma 3.6, the difference between the left-hand and right-hand side of (3.56) converges to zero
in probability, so that also

$n^{-\rho}|C(1)| \xrightarrow{d} H^S(0).$ (3.57)

This completes the proof of Theorem 2.1, and identifies $H_1(0) = H^S(0)$. In the same vein,

$W(1) = \sum_{i\in C(1)} w_i = \sum_{j=1}^{V(1)} w_{M_j}J_j.$ (3.58)

Now, by (3.1), for any $l \ge 1$,

$\sum_{j=1}^{l} w_{M_j}J_j = S_l + l.$ (3.59)

As a result,

$W(1) = V(1) + S_{V(1)},$ (3.60)

so that

$n^{-\rho}W(1) = n^{-\rho}V(1) + n^{-\rho}S_{V(1)}.$ (3.61)

Finally, $n^{-\rho}V(1) \xrightarrow{d} H^S(0)$, and, since $\alpha < \rho$, $n^{-\rho}|S_{V(1)}| = o(1)\,n^{-\alpha}|S_{V(1)}| \xrightarrow{\mathbb{P}} 0$. This proves that
$n^{-\rho}W(1) \xrightarrow{d} H^S(0)$ as well. □

In the next section, where we study the joint convergence of various clusters simultaneously, we shall also 
need the following joint convergence result: 

Proposition 3.7 (Weak convergence of functionals). As $n\to\infty$,

$\big(n^{-\rho}|C(1)|, (1_{\{q\in C(1)\}})_{q\ge 1}\big) \xrightarrow{d} \big(H^S(0), (\mathcal{I}_q(H^S(0)))_{q\ge 1}\big),$ (3.62)

in the product topology, where $\mathcal{I}_q(H^S(0))$ denotes the indicator that $\mathcal{I}_q(t) = 1$ at the hitting time of zero of
$(S_t)_{t\ge 0}$. Moreover, (i) the random variable $H^S(0)$ is non-degenerate; and (ii) the indicators $(\mathcal{I}_q(H^S(0)))_{q\ge 2}$
are non-trivial in the sense that they take the values 0 and 1 each with positive probability.

We note that, while the indicator processes $(\mathcal{I}_q(t))_{t\ge 0}$ are independent for different $q$, the random
variables $(\mathcal{I}_q(H^S(0)))_{q\ge 1}$ are not independent since $H^S(0)$, the hitting time of zero of the process $(S_t)_{t\ge 0}$,
depends sensitively on all of the indicator processes.
Proof. We shall use a randomization trick. Indeed, let $(N_j^{(n)}(t))_{t\ge 0}$, $j\in[n]$, be a sequence of independent Poisson
processes with rates $w_j/\ell_n$. Let

$T_j = \inf\{t : N(t) = j\}$, where $N(t) = \sum_{j'\in[n]} N_{j'}^{(n)}(t).$ (3.63)

Then, $t\mapsto N(t)$ is a rate 1 Poisson process, and we have that (recall (3.11))

$S_l = S'_{T_l},$ (3.64)
where the continuous-time process $(S'_t)_{t\ge 0}$ is defined by

s , _ wi{x) _ mm mmt) + ± WlW[1 _ s^W] + MA) _ 1)JV(t) . ( 3.65) 

n i=2 n 

By construction, the processes $(1_{\{N_q^{(n)}(n^{\rho}t)\ge 1\}})_{t\ge 0}$ are independent, and are characterized by the birth
times

$E_q^{(n)} = \inf\{t : N_q^{(n)}(n^{\rho}t) \ge 1\}.$ (3.66)

Again by construction, these birth times are independent for different $q \ge 2$, and $E_q^{(n)}$ has an exponential
distribution with parameter $n^{\rho}w_q/\ell_n$. The parameters of these exponential random variables converge to

$n^{\rho}w_q/\ell_n \to a q^{-\alpha},$ (3.67)

where $a = c_F^{\alpha}/\mathbb{E}[W]$, and which are the parameters of the limiting exponential random variables in terms
of which we can identify $\mathcal{I}_q(t) = 1_{\{N_q(t)\ge 1\}} = 1_{\{\mathrm{Exp}(aq^{-\alpha})\le t\}}$ (see (2.48)). By the convergence of the
parameters, we can couple $E_q^{(n)}$ with $E_q = \mathrm{Exp}(aq^{-\alpha})$ in such a way that, for every $q \ge 2$ fixed,

$\mathbb{P}(E_q^{(n)} \ne E_q) = o(1).$ (3.68)

Indeed, (3.68) follows by noting that, by (3.67), the density of $E_q^{(n)}$ converges pointwise to that of $E_q$,
which, by [34, (7.3)], implies that we can couple $(E_q^{(n)})_{n\ge 1}$ to $E_q$ in such a way that (3.68) holds.
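The convergence of the exponential parameters in (3.67) can be illustrated numerically. The sketch below assumes pure power-law weights $w_j = (c_F n/j)^{\alpha}$, for which $\mathbb{E}[W] = c_F^{\alpha}/(1-\alpha)$; this explicit choice and the function names are illustrative assumptions, not the general setting of (1.6) and (1.15).

```python
def birth_rate(n: int, q: int, tau: float, c_f: float = 1.0) -> float:
    """Rate n^rho * w_q / l_n of the birth time of vertex q.

    Assumes the pure power-law weights w_j = (c_f * n / j)^alpha.
    """
    alpha = 1.0 / (tau - 1.0)
    rho = (tau - 2.0) / (tau - 1.0)
    total_weight = sum((c_f * n / j) ** alpha for j in range(1, n + 1))  # l_n
    w_q = (c_f * n / q) ** alpha
    return n ** rho * w_q / total_weight


def limiting_rate(q: int, tau: float, c_f: float = 1.0) -> float:
    """Limit a * q^(-alpha) with a = c_f^alpha / E[W]."""
    alpha = 1.0 / (tau - 1.0)
    mean_weight = c_f ** alpha / (1.0 - alpha)  # E[W] for the pure power law
    return (c_f ** alpha / mean_weight) * q ** (-alpha)
```

For instance, with `tau = 3.5` and `n = 100_000` the ratio `birth_rate(n, q, tau) / limiting_rate(q, tau)` is already within one percent of 1 for small `q`.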

Equation (3.68), jointly with the independence of $(E_q^{(n)})_{n\ge 1}$ for different $q$'s, immediately implies that,
for each $K \ge 1$,

$\mathbb{P}\big(1_{\{N_q^{(n)}(n^{\rho}t)\ge 1\}} = \mathcal{I}_q(t)\ \forall t \ge 0, q\in[K]\big) = 1 - o(1),$ (3.69)

so that we have also, whp, perfectly coupled the entire processes $(1_{\{N_q^{(n)}(n^{\rho}t)\ge 1\}})_{t\ge 0}$ and $(\mathcal{I}_q(t))_{t\ge 0}$. In
particular, this implies that, for every $K \ge 2$,

$\mathbb{P}\big(1_{\{N_q^{(n)}(T_l)\ge 1\}} = \mathcal{I}_q(n^{-\rho}T_l)\ \forall l \ge 1, q\in[K]\big) = 1 - o(1),$ (3.70)

and, by construction, $1_{\{N_q^{(n)}(T_l)\ge 1\}} = \mathcal{I}_q^{(n)}(l)$.

We apply the perfect coupling to $l = V(1)$, for which $1_{\{N_q^{(n)}(T_{V(1)})\ge 1\}} = 1_{\{q\in C(1)\}}$. This provides a perfect
coupling between $1_{\{q\in C(1)\}}$ and $\mathcal{I}_q(n^{-\rho}T_{V(1)})$. We then note that

$n^{-\rho}|C(1)| \xrightarrow{d} H^S(0),$ (3.71)
and, since $T_j$ is the birth time of the $j$th individual in a rate 1 Poisson process,

$\sup_{t\le u} |n^{-\rho}T_{tn^{\rho}} - t| \xrightarrow{\mathbb{P}} 0,$ (3.72)

where, for non-integer $tn^{\rho}$, we recall the convention below (2.36).
Weak convergence of $(1_{\{q\in C(1)\}})_{q\ge 1}$ in the product topology is equivalent to the weak convergence of
$(1_{\{q\in C(1)\}})_{q\in[m]}$ for any $m \ge 1$ (see [30, Theorem 4.29]). Therefore, together with the exact coupling in
(3.70), this completes the proof of (3.62), since the processes $(\mathcal{I}_q(t))_{t\ge 0}$ have a.s. no jump close to $H^S(0)$.

We continue to show the properties of the limiting variables. The random variable $H^S(0)$ is non-degenerate,
since its distribution does not have any atoms. We shall next show that the limit of $1_{\{q\in C(1)\}}$ is
non-trivial. We shall show this only for $q = 2$, the proof for $q > 2$ being identical. For this, we use the fact
that $\mathbb{P}(H^S(0) > K)$ can be made arbitrarily small by choosing $K$ large. First,

$\lim_{n\to\infty} \mathbb{P}(2\in C(1)) \ge \mathbb{P}(H^S(0) > \varepsilon, \mathcal{I}_2(\varepsilon) = 1) \ge \mathbb{P}(H^S(0) > \varepsilon)\mathbb{P}(\mathcal{I}_2(\varepsilon) = 1) > 0,$ (3.73)

by the Fortuin-Kasteleyn-Ginibre (FKG) inequality (see [20, Thm. 2.4]) and the fact that both random
variables $H^S(0)$ and $\mathcal{I}_2(\varepsilon)$ are monotone in the independent exponential random variables that describe
the first hit of $q$ for all $q \ge 1$, so that both $\{H^S(0) > \varepsilon\}$ and $\{\mathcal{I}_2(\varepsilon) = 1\}$ are increasing events.
Further,

$\lim_{n\to\infty} \mathbb{P}(2\notin C(1)) \ge \mathbb{P}(H^S(0) \le K, \mathcal{I}_2(K) = 0) = \mathbb{P}(\mathcal{I}_2(K) = 0) - \mathbb{P}(H^S(0) > K, \mathcal{I}_2(K) = 0).$ (3.74)

Again by FKG, now using that $\{H^S(0) > K\}$ is an increasing event, while $\{\mathcal{I}_2(K) = 0\}$ is a decreasing
event,

$\mathbb{P}(H^S(0) > K, \mathcal{I}_2(K) = 0) \le \mathbb{P}(H^S(0) > K)\mathbb{P}(\mathcal{I}_2(K) = 0).$ (3.75)

Thus, for $K$ so large that $\mathbb{P}(H^S(0) > K) < 1$,

$\lim_{n\to\infty} \mathbb{P}(2\notin C(1)) \ge \mathbb{P}(\mathcal{I}_2(K) = 0) - \mathbb{P}(H^S(0) > K)\mathbb{P}(\mathcal{I}_2(K) = 0) = \mathbb{P}(H^S(0) \le K)\mathbb{P}(\mathcal{I}_2(K) = 0) > 0,$ (3.76)

which proves the claim. □

Remark 3.8 (Convergence in the uniform topology). In fact, by the proof of Proposition 3.7, we even
obtain that the weak convergence in Theorem 2.1 holds in the uniform topology. Indeed, the coupling
obtained in the proof of Proposition 3.7 (see in particular (3.69)) shows that we can couple $(S_t^{(n,K)})_{t\ge 0}$ and
$(S_t^{(\infty,K)})_{t\ge 0}$ such that these processes are whp equal for all $t \ge 0$. By (3.26) in Corollary 3.2, $(Z_t^{(n)})_{t\ge 0}$
is close to $(S_t^{(n,K)})_{t\ge 0}$ in the uniform topology on $[0, T]$, while (3.38) shows that $(S_t^{(\infty,K)})_{t\ge 0}$ is uniformly
close to $(S_t)_{t\ge 0}$. This proves the convergence in the uniform topology.

4 Convergence of multiple clusters: Proof of Theorem 1.1 

In this section, we extend the analysis of one cluster in Section 2 to multiple clusters. The main result is 
as follows: 

Theorem 4.1 (Weak convergence of the cluster of first vertices for $\tau\in(3,4)$). Fix the Norros-Reittu
random graph with weights $w(\lambda)$ defined in (1.15). Assume that $\nu = 1$ and that (1.6) holds. Then, for all
$\lambda\in\mathbb{R}$,

$(n^{-\rho}|C_{\le}(i)|)_{i\ge 1} \xrightarrow{d} (H_i(0))_{i\ge 1},$ (4.1)

for some non-degenerate limit $(H_i(0))_{i\ge 1}$.

In the remainder of this section, we shall prove Theorem 4.1, and use it to complete the proof of 
Theorem 1.1. 



We let $I_1^{(n)} = 1$, and let

$I_2^{(n)} = \min\,[n]\setminus C(1)$ (4.2)

be the minimal element that is not part of $C(1)$, where, for a set of indices $A\subseteq[n]$, we let $\min A$ denote
the minimal element of $A$. To extend the above definitions further, we define, recursively,

$\mathcal{D}_i^{(n)} = C_{\le}(I_i^{(n)})$, and $\mathcal{D}_{\le i}^{(n)} = \bigcup_{j\le i} \mathcal{D}_j^{(n)}.$ (4.3)

Then, we define $I_{i+1}^{(n)}$ by

$I_{i+1}^{(n)} = \min\,[n]\setminus\mathcal{D}_{\le i}^{(n)},$ (4.4)

which is the smallest index whose cluster we have not yet explored.
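The recursive definition of the starting indices can be made concrete on a finite graph. The following sketch is an illustration only: the input format (an edge list on the vertex set $\{1,\dots,n\}$) and the function name are assumptions, and the sets `explored` and `starts` play the roles of $\mathcal{D}_{\le i}^{(n)}$ and $(I_i^{(n)})_{i\ge 1}$.

```python
from collections import deque


def clusters_in_index_order(n, edges):
    """Return the clusters of the graph on {1, ..., n}, each started from the
    smallest index that has not been explored yet (I_1 = 1, then I_2, ...)."""
    neighbours = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    explored = set()  # union of the clusters found so far
    starts, clusters = [], []
    for v in range(1, n + 1):
        if v in explored:
            continue  # v belongs to a cluster that was explored earlier
        starts.append(v)  # v is the smallest unexplored index
        cluster, queue = {v}, deque([v])
        while queue:
            u = queue.popleft()
            for w in neighbours[u] - cluster:
                cluster.add(w)
                queue.append(w)
        explored |= cluster
        clusters.append(sorted(cluster))
    return starts, clusters
```

Every index that is not a starting index lies in the cluster of a smaller starting index, which is why only the clusters started from $I_1^{(n)}, I_2^{(n)}, \dots$ need to be tracked.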

Obviously, $|C_{\le}(i)| = 0$ unless $i = I_j^{(n)}$ for some $j$. This prompts us to investigate the weak convergence
of $n^{-\rho}|\mathcal{D}_i^{(n)}|$. This will be done by induction on $i$. The induction hypothesis is that

in the product topology, for some limiting random variables. Part of the induction hypothesis is that these
limiting random variables satisfy the following facts: (1) the limiting random variables $(H_j(0))_{j\in[i]}$ are
non-degenerate, in the sense that the essential support of the random vector $(H_j(0))_{j\in[i]}$ is $i$-dimensional;
and (2) the random indicators $(1_{\{q\in\mathcal{D}_{\le j}\}})_{j\in[i], q>i}$ are non-trivial, in the sense that they take the values
zero and one, each with positive probability. By construction, $1_{\{q\in\mathcal{D}_{\le i}\}} = 1$ for $q \le i$, so the restriction
to $q > i$ in condition (2) is the most we can hope for.

We shall start by initializing the induction hypothesis for $j = 1$, which follows from Proposition 3.7,
as we show now. Indeed, we have that $\mathcal{D}_1^{(n)} = \mathcal{D}_{\le 1}^{(n)} = C(1)$, so that (4.5) is identical to the statement in
Proposition 3.7.

We next advance the induction hypothesis by verifying that (4.5) also holds for $j = i + 1$. We first
intuitively explain our approach. The random variable $H_{i+1}(0)$ shall be the weak limit of $n^{-\rho}|\mathcal{D}_{i+1}^{(n)}|$. We
shall show that $H_{i+1}(0)$ is the hitting time of zero of a process similar to $(S_t)_{t\ge 0}$ in Section 2. We now
start by explaining how this process arises.

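Assume that the induction hypothesis (4.5) holds for $i$. By (4.5), the index set $\mathcal{D}_{\le i}$ is the (random)
set of indices for which

$(1_{\{q\in\mathcal{D}_{\le i}^{(n)}\}})_{q\ge 1} \xrightarrow{d} (1_{\{q\in\mathcal{D}_{\le i}\}})_{q\ge 1}.$ (4.6)
Then, we note that, by (4.5), we have that

$I_{i+1}^{(n)} = \min\{q : 1_{\{q\in\mathcal{D}_{\le i}^{(n)}\}} = 0\}, \quad I_{i+1} = \min\{q : 1_{\{q\in\mathcal{D}_{\le i}\}} = 0\},$ (4.7)

and we see that $I_{i+1}^{(n)}$ and $I_{i+1}$ are deterministic functions of the sets $\mathcal{D}_{\le i}^{(n)}$ and $\mathcal{D}_{\le i}$, respectively. The
random variable $I_{i+1}$ is finite, since, for $K, Q \ge 1$ large,

$\mathbb{P}(I_{i+1}^{(n)} > K) \le \mathbb{P}(|\mathcal{D}_{\le i}^{(n)}| > Qn^{\rho}) + \mathbb{P}(I_{i+1}^{(n)} > K, |\mathcal{D}_{\le i}^{(n)}| \le Qn^{\rho}).$ (4.8)

The first probability converges, by (4.5) and the continuous-mapping theorem, to $\mathbb{P}(H_1(0) + \cdots + H_i(0) > Q)$,
which is small for $Q \ge 1$ large. For the second probability in (4.8), and for $i < K/2$, we can bound

$\mathbb{P}(I_{i+1}^{(n)} > K, |\mathcal{D}_{\le i}^{(n)}| \le Qn^{\rho}) \le \mathbb{P}(\text{vertex } K \text{ drawn in } Qn^{\rho} \text{ vertex checks})$ (4.9)

$\le \mathbb{P}(\exists l \le Qn^{\rho} : M_l = K) \le \sum_{l=1}^{Qn^{\rho}} \frac{w_K}{\ell_n} \le CQK^{-\alpha},$

which converges to zero as $K\to\infty$ when $Q = K^{\beta}$ with $\beta < \alpha$. As a result, we have that

$\mathbb{P}(I_{i+1} > K) = \lim_{n\to\infty} \mathbb{P}(I_{i+1}^{(n)} > K)$ (4.10)
is small for $K$ large.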

We conclude that, from the induction hypothesis in (4.5), we obtain the joint convergence 



We now start exploring the cluster of $I_{i+1}^{(n)}$, and we need to show that this cluster size, as well as the
indices in it, converge in distribution. More precisely, the joint convergence in (4.5) for $i + 1$ (and thus
the advancement of the induction hypothesis) follows when we prove that, conditionally on $\mathcal{D}_{\le i}^{(n)}$,

$\big(n^{-\rho}|\mathcal{D}_{i+1}^{(n)}|, I_{i+1}^{(n)}, (1_{\{q\in\mathcal{D}_{\le i+1}^{(n)}\}})_{q\ge 1}\big) \xrightarrow{d} \big(H_{i+1}(0), I_{i+1}, (1_{\{q\in\mathcal{D}_{\le i+1}\}})_{q\ge 1}\big).$ (4.12)

To prove (4.12), we follow the approach in Section 2 as closely as possible. A crucial observation is that
after the exploration of $\mathcal{D}_{\le i}^{(n)}$ and conditionally on it, the remaining graph is again a rank-1 inhomogeneous
random graph, with (a) vertex set $[n]\setminus\mathcal{D}_{\le i}^{(n)}$ and (b) edge probabilities, for $u, v\in[n]\setminus\mathcal{D}_{\le i}^{(n)}$, given by

$p_{uv} = 1 - e^{-w_u w_v/\ell_n}.$

We now extend the exploration process of clusters described in Section 2.1 to the setting above. As
in Section 2, we set $Z_0(i) = 1$ and let $Z_1(i)$ denote the number of neighbors of the vertex $I_{i+1}^{(n)}$, i.e.,

$Z_1(i) = \sum_{j\in[n]\setminus\mathcal{D}_{\le i}^{(n)}} \mathrm{Poi}\big(w_{I_{i+1}^{(n)}}(\lambda)w_j/\ell_n\big) = \mathrm{Poi}\big(w_{I_{i+1}^{(n)}}(\lambda)\ell_n(i)/\ell_n\big),$ (4.13)

where we let

$\ell_n(i) = \sum_{j\in[n]\setminus\mathcal{D}_{\le i}^{(n)}} w_j$ (4.14)

be the total weight of vertices outside $\mathcal{D}_{\le i}^{(n)}$. For $l \ge 2$, $(Z_l(i))_{l\ge 1}$ satisfies the recursion relation

$Z_l(i) = Z_{l-1}(i) + X_l(i) - 1,$ (4.15)

where $X_l(i)$ denotes the number of potential neighbors outside of $\mathcal{D}_{\le i}^{(n)}$ of the $l$th vertex which is
explored. As explained in more detail in Section 2, the distribution of $X_l(i)$ (for $2 \le l \le n$) is equal to
$\mathrm{Poi}(w_{M_l(i)}\ell_n(i)/\ell_n)J_l(i)$, where now the marks $(M_l(i))_{l\ge 2}$ are i.i.d. random variables with distribution
$M(i)$ given by

$\mathbb{P}(M(i) = m) = w_m/\ell_n(i), \quad m\in[n]\setminus\mathcal{D}_{\le i}^{(n)},$ (4.16)

and

$J_l(i) = 1_{\{M_l(i)\notin\{I_{i+1}^{(n)}\}\cup\{M_2(i),\ldots,M_{l-1}(i)\}\}}$ (4.17)

is the indicator that the mark $M_l(i)$ has not been found up to time $l$ and is not equal to vertex $I_{i+1}^{(n)}$.
Then, the number of vertex checks $V(I_{i+1}^{(n)})$ in the exploration of $\mathcal{D}_{i+1}^{(n)} = C_{\le}(I_{i+1}^{(n)})$ equals

$V(I_{i+1}^{(n)}) = \inf\{l : Z_l(i) = 0\},$ (4.18)

and 

= (1 P<H?ih wi(0=^} Wi' (4 ' 19) 

while $1_{\{I_{i+1}^{(n)}\in\mathcal{D}_{i+1}^{(n)}\}} = 1$. We again note that

$|\mathcal{D}_{i+1}^{(n)}| = |C_{\le}(I_{i+1}^{(n)})| \le V(I_{i+1}^{(n)}),$ (4.20)

while

$n^{-\rho}\big[V(I_{i+1}^{(n)}) - |\mathcal{D}_{i+1}^{(n)}|\big] \xrightarrow{\mathbb{P}} 0,$ (4.21)
which can be proved along the lines of the proof of Lemma 3.6. This gives us a convenient description of
all the random variables needed to advance the induction hypothesis.
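The recursion (4.15) together with the stopping rule (4.18) amounts to a first-passage computation for a walk with steps $X_l(i) - 1$. A minimal sketch, in which the offspring numbers are supplied as a plain list rather than sampled from the thinned Poisson distributions above (the function name and input format are hypothetical):

```python
def vertex_checks(offspring):
    """Return V = inf{l : Z_l = 0} for Z_0 = 1, Z_l = Z_{l-1} + X_l - 1.

    `offspring[l - 1]` plays the role of X_l; None is returned when the walk
    does not reach zero within the supplied steps.
    """
    z = 1
    for step, x in enumerate(offspring, start=1):
        z += x - 1
        if z == 0:
            return step
    return None
```

For example, `vertex_checks([2, 0, 1, 0, 0])` follows the path 1, 2, 1, 1, 0 and stops at the fourth step.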

In order to prove the weak convergence of $n^{-\rho}V(I_{i+1}^{(n)})$, we again investigate the scaling limit of the
process $(Z_l(i))_{l\ge 0}$. For this, we define $S_0(i) = 1$, $S_1(i) = w_{I_{i+1}^{(n)}}(\lambda)\ell_n(i)/\ell_n$ and, for $l \ge 2$,

$S_l(i) = S_{l-1}(i) + w_{M_l(i)}(\lambda)J_l(i)\ell_n(i)/\ell_n - 1.$ (4.22)

Then, as in Lemma 3.1, it is easy to show that, conditionally on $\mathcal{D}_{\le i}^{(n)}$, the processes $(S_l(i))_{l\ge 0}$ and
$(Z_l(i))_{l\ge 0}$ are uniformly close. Denote by $\mathcal{B}_i^{(n)} = \mathcal{D}_{\le i}^{(n)}\cup\{I_{i+1}^{(n)}\}$ the union of all vertices explored in the
first $i$ clusters and the minimal element not in the first $i$ clusters. Then, we rewrite

Si® = w jM (A)^ + Y J w M]{i) ^J 3 {i) -(I -I) (4.23) 



ie[n]\B^ 

where 

A n) (l;i) = M3 j <UM j (i)= q} - (4.24) 

We further rewrite the above as 

S ,( i ) = w J ,,(A)^) + £ «.W^W('iO-^)+l( £ (4.25) 

qe{n]\Bl n) ie[n]\Bl n) 

We note that we can rewrite the last sum, using (2.30), as 

w up 1 
(1 + AfT") Y, / " 1 = K(A) - 1) - (1 + An'") £ / 

9G[n]\B< n) q€B\ n) 



w 2 



in 



V ~ E -f+°( n ' V )- ( 4 - 26 ) 



geB™ 



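In turn, the sum can be approximated by

$\sum_{q\in\mathcal{B}_i^{(n)}} \frac{w_q^2}{\ell_n} = d\,n^{-\eta}\sum_{q\in\mathcal{B}_i^{(n)}} q^{-2\alpha}(1 + o_P(1)),$ (4.27)

where $d = c_F^{2\alpha}/\mathbb{E}[W]$. Denoting

$D_i^{(n)} = d\sum_{q\in\mathcal{B}_i^{(n)}} q^{-2\alpha},$ (4.28)
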

we therefore have that 



S l (i)=w lin) £ ^+ £ W /^(l!f\^)-p^)+l(6-D^)n^ + o r (ln-^ (4.29) 

»+i *-n i n **nv') 

qe[n]\B\ n > 

We conclude that we arrive at a process similar to the one obtained when exploring $C(1)$, apart from the fact that
(i) fewer vertices are allowed to participate; (ii) a negative drift $-D_i^{(n)}$ is introduced; and (iii) a factor
$\ell_n(i)/\ell_n = 1 + o_P(1)$ is introduced.
We proceed by investigating the convergence of $D_i^{(n)}$:
Lemma 4.2 (Weak convergence of random drift). As $n\to\infty$, and assuming (4.11),

$D_i^{(n)} \xrightarrow{d} D_i \equiv d\sum_{q\in\mathcal{D}_{\le i}\cup\{I_{i+1}\}} q^{-2\alpha},$ (4.30)

where $(\mathcal{D}_{\le i}, I_{i+1})$ is the weak limit of $(\mathcal{D}_{\le i}^{(n)}, I_{i+1}^{(n)})$ given in (4.11).

Proof. We start by bounding $\mathbb{P}(q\in\mathcal{D}_{\le i}^{(n)})$, for $q > 0$ large. We shall first prove that the probability
that $|\mathcal{D}_{\le i}^{(n)}| \le n^{\rho}K$ is $1 - o(1)$ when $K > 0$ grows large. Indeed, by [24, Theorem 1.2], we have that
$|C_{\max}| = \max_i |C_{\le}(i)| \le \omega n^{\rho}$ with probability $1 - o(1)$, as $\omega\to\infty$. Thus, $|\mathcal{D}_{\le i}^{(n)}| \le n^{\rho}(i\omega) = n^{\rho}K$, with
probability $1 - o(1)$ as $K\to\infty$, when we take $K = \omega i$. Thus, denoting

£l% = {\V^\<n»K}, (4.31) 

we have that 

" ' f 'j fj i f ,L -i,K 

since, independently of the choices before, the probability of drawing q is at most w q / Ylj>KnP w r Now 



g \ {il n) y i= i} n £ lt) < » p #^ — 2 — > ( 4 - 32 ) 



wj =e n {l + o{l)) =E[W]n(l + o(l)). (4.33) 

j>KnP 



Thus, for some C > 0, 
so that 



E 



G D|» \ {Jj n) }} =1 } n £g) < CKq~ a , (4.34) 
V <T 2c %(J <iQ~ 2a + CKQ 1 ~* a , (4.35) 



where the first contribution arises from the (at most i) values of q = for j£ [i + 1] for which 7j n) > Q, 
and the second contribution from the q G" {/j }y 6 [ i+ i]. 

Equation (4.35) implies that the weak convergence of -D 4 - n) follows from the weak convergence of 

£ g" 2 ", (4.36) 
<zeS? n) : 9 <Q 

which, in turn, follows from (4.11) and the continuous mapping theorem. □ 
Now we are ready to complete the proof of Theorem 4.1: 

Proof of Theorem 4.1. We start by setting the stage for the weak convergence of processes needed
to advance the induction hypothesis as formulated in (4.12). Define

$Z_t^{(n)}(i) = n^{-\alpha}Z_{tn^{\rho}}(i), \quad S_t^{(n)}(i) = n^{-\alpha}S_{tn^{\rho}}(i),$ (4.37)

and

$S_t(i) = b\,I_{i+1}^{-\alpha} + \sum_{q\notin\mathcal{D}_{\le i}\cup\{I_{i+1}\}} aq^{-\alpha}\big(\mathcal{I}_q(t) - atq^{-\alpha}\big) + t(c - D_i).$ (4.38)

Then, using Lemma 4.2, the proof of Theorem 2.1 can easily be adapted to prove that $n^{-\rho}|\mathcal{D}_{i+1}^{(n)}| \xrightarrow{d}
H_{i+1}(0)$, where $H_{i+1}(0)$ is the hitting time of zero of $(S_t(i))_{t\ge 0}$, and where $a, b, c$ are given by $a = c_F^{\alpha}/\mathbb{E}[W]$,
$b = c_F^{\alpha}$ and $c = \theta$.
Indeed, in more detail, we shall work conditionally on $\mathcal{D}_{\le i}^{(n)}$. The proof of Theorem 2.1 reveals
that the main contribution to $(S_t(i))_{t\ge 0}$ and $(S_t^{(n)}(i))_{t\ge 0}$ arises from the vertices $q\in[K]$. Now, since
(^•{ aeT) (n)y)a,e[K} is a sequence of discrete random variables taking a finite number of outcomes and that 

converge in distribution, we have that its probability mass function converges pointwise. By [34, (6.3) on 
p. 16], this implies that we can couple (1 („) -,) a ^\K] t° (^-{a€V <i+1 })ae[K] i n such a way that 

H^haev^ lW] + (Hae<D <i+1 })ae[K]) = o(l). (4.39) 

Therefore, whp, there is a perfect coupling between the elements in [K] of T>^ +1 and 2?<j+i. When this 
is the case, we can basically think of the set of summands in (4.25) as being deterministic and follow the 
proof of Theorem 2.1 verbatim. 

Further, the proof of Proposition 3.7 can be adapted to prove the joint convergence of

$\big(n^{-\rho}|\mathcal{D}_{i+1}^{(n)}|, (1_{\{q\in\mathcal{D}_{i+1}^{(n)}\}})_{q\ge 1}\big) \xrightarrow{d} \big(H_{i+1}(0), (\mathcal{I}_q(H_{i+1}(0)))_{q\ge 1}\big).$ (4.40)

Together with the induction hypothesis, this proves that (4.5) also holds for all $j \le i + 1$, and, thus, we
have advanced the induction hypothesis. This, in particular, proves Theorem 4.1. The proof for cluster
weights follows in an identical way as the convergence proof of $n^{-\rho}W(1)$ in the proof of Theorem 2.1. □
We finally complete the proof of Theorem 1.1: 

Proof of Theorem 1.1. Weak convergence of $(|C_{(i)}|n^{-\rho})_{i\ge 1}$ in the product topology is equivalent to
the weak convergence of $(|C_{(i)}|n^{-\rho})_{i\in[m]}$ for any $m \ge 1$ (see [30, Theorem 4.29]). In turn, by Proposition
1.6, this follows from the convergence in distribution of $(|C_{\le}(i)|n^{-\rho})_{i\in[m]}$ for all $m$. The latter follows from
Theorem 4.1. Since, whp, again by Proposition 1.6, $(|C_{(i)}|n^{-\rho})_{i\in[m]}$ is equal to the largest $m$ components
of $(|C_{\le}(i)|n^{-\rho})_{i\in[m]}$, we have identified

$(\gamma_i(\lambda))_{i\ge 1} = (H_{(i)}(0))_{i\ge 1},$ (4.41)

where $(H_{(i)}(0))_{i\ge 1}$ is $(H_i(0))_{i\ge 1}$ ordered in size. This completes the proof of Theorem 1.1, and identifies
the limiting random variables. □

5 Proof of Theorems 1.5 and 1.6 

In this section, we prove Theorems 1.5 and 1.6, using the results in Theorems 2.1 and 4.1, as well as
Proposition 3.7. We start with a proof of Theorem 1.6. Note that, combining parts (a) and (b)
in Theorem 1.6, we obtain that, with high probability as $K$ becomes large, the largest $m$ clusters are all
among the first $(|C_{\le}(i)|)_{i\in[K]}$. This explains why we start the cluster exploration from the vertices with
the highest weights.

Proof of Theorem 1.6. (a) For $\max_{i>K} |C_{\le}(i)| \ge \varepsilon n^{\rho}$ to occur, we must have that there exists a cluster
using the vertices in $[n]\setminus[K]$ such that (1) $|C_{\le}(i)| \ge \varepsilon n^{\rho}$, and (2) the cluster $C_{\le}(i)$ is not connected to
any of the vertices in $[K]$.

By construction, the graph restricted to the vertices in $[n]\setminus[K]$ is again a Norros-Reittu model, with edge
probabilities $p_{ij} = 1 - e^{-w_i w_j/\ell_n}$, for all $i, j\in[n]\setminus[K]$. However, no vertex in $[n]\setminus[K]$ found to be in the
cluster $C(i)$ is allowed to have an edge to any of the vertices in $[K]$. We shall now bound this probability,
making use of the results in [24].
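The restriction property used here, namely that removing the vertices in $[K]$ leaves the edge probabilities among the remaining vertices unchanged, can be sketched in code. The function names and the representation of the weights as a list are illustrative assumptions; the quadratic loop is meant for small examples only.

```python
import math
import random


def edge_probability(w_i, w_j, total_weight):
    """Norros-Reittu edge probability 1 - exp(-w_i * w_j / l_n)."""
    return 1.0 - math.exp(-w_i * w_j / total_weight)


def sample_restricted_edges(weights, k_removed, rng):
    """Sample the edges among the vertices K+1, ..., n (1-based labels).

    `weights[v - 1]` is w_v, and l_n is the total weight of all n vertices,
    so removing the first K vertices does not change the edge probabilities.
    """
    total_weight = sum(weights)  # l_n
    n = len(weights)
    edges = []
    for u in range(k_removed + 1, n + 1):
        for v in range(u + 1, n + 1):
            if rng.random() < edge_probability(weights[u - 1], weights[v - 1], total_weight):
                edges.append((u, v))
    return edges
```

Calling `sample_restricted_edges(weights, K, random.Random(seed))` returns only edges between vertices with labels larger than `K`, while $\ell_n$ is still computed from all $n$ weights.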
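With

$Z_{\ge k}^{[K]} = \sum_{v=1}^{n} 1_{\{|C(v)|\ge k,\ C(v)\cap[K]=\emptyset\}},$ (5.1)
we have

$\mathbb{P}\big(\max_{i>K} |C_{\le}(i)| \ge k\big) = \mathbb{P}\big(Z_{\ge k}^{[K]} \ge k\big) \le \frac{\mathbb{E}[Z_{\ge k}^{[K]}]}{k} = \frac{1}{k}\sum_{v=1}^{n} \mathbb{P}(|C(v)| \ge k, C(v)\cap[K] = \emptyset).$ (5.2)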



Denote by $C^{[K]}(v)$ the cluster of $v$ restricted to the vertices $[n]\setminus[K]$. Then, due to the independence of
disjoint sets of edges, and the fact that $C(v)\cap[K] = \emptyset$ only depends on edges between $[K]$ and $[n]\setminus[K]$,
while $|C^{[K]}(v)| \ge k$ depends only on edges between pairs of vertices in $[n]\setminus[K]$, we obtain

$\mathbb{P}(|C(v)| \ge k, C(v)\cap[K] = \emptyset) = \mathbb{E}\big[\mathbb{P}(C(v)\cap[K] = \emptyset \mid C^{[K]}(v))1_{\{|C^{[K]}(v)|\ge k\}}\big]$ (5.3)

$= \mathbb{E}\big[e^{-W^{[K]}(v)w_{[K]}/\ell_n}1_{\{|C^{[K]}(v)|\ge k\}}\big],$

where, similarly to (1.17), we define

$W^{[K]}(v) = \sum_{a\in C^{[K]}(v)} w_a$ and $w_{[K]} = \sum_{i=1}^{K} w_i.$ (5.4)

We split depending on whether $W^{[K]}(v) \ge k/2$ or not, to obtain

$\mathbb{P}\big(\max_{i>K} |C_{\le}(i)| \ge k\big) \le \frac{1}{k}\sum_{v=K+1}^{n} e^{-w_{[K]}k/(2\ell_n)}\mathbb{P}(|C^{[K]}(v)| \ge k)$ (5.5)

$+ \frac{1}{k}\sum_{v=K+1}^{n} \mathbb{P}(|C^{[K]}(v)| \ge k, W^{[K]}(v) < k/2).$ (5.6)

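For the first term we compute that, for some $C > 0$,

$w_{[K]} \ge c_F^{\alpha}\sum_{i=1}^{K} (n/i)^{\alpha}(1 + o(1)) \ge Cn^{\alpha}K^{\rho}.$ (5.7)

Thus, when $k = k_n = \varepsilon n^{\rho}$, we obtain, for some $u > 0$, and using $\alpha + \rho = 1$ (see (2.10)),

$\frac{1}{k_n}\sum_{v\in[n]} e^{-w_{[K]}k_n/(2\ell_n)}\mathbb{P}(|C^{[K]}(v)| \ge k_n) \le e^{-u\varepsilon K^{\rho}}\frac{1}{k_n}\sum_{v\in[n]} \mathbb{P}(|C^{[K]}(v)| \ge k_n) \le e^{-u\varepsilon K^{\rho}}\frac{1}{k_n}\sum_{v\in[n]} \mathbb{P}(|C(v)| \ge k_n)$

$= e^{-u\varepsilon K^{\rho}}\frac{n}{k_n}\mathbb{P}(|C(V)| \ge k_n),$ (5.8)

where $V\in[n]$ is a vertex chosen uniformly at random from $[n]$. By [24, Proposition 2.4(a)], there exists
a constant $a_1 < \infty$ such that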

H\C(V)\ > k n ) < ai [kn 1/{T - 2) + (enVn-Hl/l-Y/M) < 0l (A-l/(r-2) +n -l/(r-l)) > (5.9) 

so that, for $k = k_n = \varepsilon n^{\rho}$ with $\varepsilon < 1$ and with $a'_1 = 2a_1$,

$\frac{n}{k_n}\mathbb{P}(|C(V)| \ge k_n) \le a'_1\varepsilon^{-(\tau-1)/(\tau-2)}.$ (5.10)

Therefore, the term in (5.5) is bounded by

$e^{-u\varepsilon K^{\rho}}a'_1\varepsilon^{-(\tau-1)/(\tau-2)}.$ (5.11)

When we pick $K = K(\varepsilon)$ sufficiently large, we can make this as small as we wish.

We continue with the term in (5.6), for which we use a large deviation argument. We formulate this
result in the following lemma:
Lemma 5.1 (Large deviations for cluster weights). For every $k = o(n)$ and $K = o(n)$, there exists a
$J > 0$ such that

$\mathbb{P}(\exists v : |C^{[K]}(v)| \ge k, W^{[K]}(v) < k/2) \le ne^{-Jk}.$ (5.12)

Proof. When $|C^{[K]}(v)| \ge k$, then $W^{[K]}(v)$ is stochastically bounded from below by the sum $\sum_{i=1}^{k} w_{v(i)}$,
where $(v(i))_{i=1}^{k}$ is the size-biased ordering of $[n]$, i.e., for every $j\notin(v(s))_{s\in[i-1]}$,

$\mathbb{P}(v(i) = j \mid (v(s))_{s\in[i-1]}) = \frac{w_j}{\sum_{l\notin(v(s))_{s\in[i-1]}} w_l}.$ (5.13)

See [8, Section 2, Lemma 2.1] for more details about the size-biased reordering. Indeed, each time we
draw a random mark and, conditionally on this mark not being one that has been found earlier as well
as on all the marks found so far, it will be equal to $j$ with the probability in (5.13). When $|C^{[K]}(v)| \ge k$,
we must draw a vertex that we have not seen yet, a total of at least $k$ times.
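The size-biased reordering in (5.13) is straightforward to simulate: at each step an unseen index is picked with probability proportional to its weight. The sketch below uses 0-based indices and a hypothetical function name; it is an illustration and not the construction of [8].

```python
import random


def size_biased_order(weights, rng):
    """Return the indices 0, ..., n-1 in size-biased random order.

    All weights are assumed to be strictly positive.
    """
    remaining = list(range(len(weights)))
    order = []
    while remaining:
        # Pick among the unseen indices with probability proportional to weight.
        pick = rng.choices(remaining, weights=[weights[j] for j in remaining])[0]
        order.append(pick)
        remaining.remove(pick)
    return order
```

Heavy indices tend to appear early in the returned order, which is the reason the first $k$ weights in the reordering dominate what is left after the largest weights have been removed.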

We apply the size-biased reordering to the vertex set $[n]\setminus[K]$. Then, for each $i$ and conditionally on
$(v(s))_{s\in[i-1]}$, the random variable $w_{v(i)}$ is stochastically bounded from below by the random variable $W'_i$
with distribution

$\mathbb{P}(W'_i = w_j) = \frac{w_j}{\sum_{l\in[n]\setminus[i-1+K]} w_l}, \quad j\in[n]\setminus[i-1+K],$ (5.14)

i.e., we have removed the vertices with the largest $i - 1 + K$ weights. As a result, the random variables
$(W'_i)_{i\ge 1}$ are independent. Now take $\kappa > 0$ very small, and note that, whenever $k - 1 + K \le \kappa n$ and for
every $i \le k$, $W'_i$ is stochastically bounded from below by a random variable $W_i^{(n)}(\kappa)$ with distribution

$\mathbb{P}(W_i^{(n)}(\kappa) = w_j) = \frac{w_j}{\sum_{l\in[n]\setminus[\kappa n]} w_l}, \quad j\in[n]\setminus[\kappa n],$ (5.15)

where the random variables $(W_i^{(n)}(\kappa))_{i=1}^{k}$ are i.i.d. Now take $\kappa > 0$ so small that

$\mathbb{E}[W_i^{(n)}(\kappa)] = \sum_{j=\kappa n}^{n} \frac{w_j^2}{\sum_{l=\kappa n}^{n} w_l} \ge 3/4.$ (5.16)

Then,

$\mathbb{P}(\exists v : |C^{[K]}(v)| \ge k, W^{[K]}(v) < k/2) \le \sum_{v=K+1}^{n} \mathbb{P}(|C^{[K]}(v)| \ge k, W^{[K]}(v) < k/2)$ (5.17)

$\le n\,\mathbb{P}\Big(\sum_{i=1}^{k} W_i^{(n)}(\kappa) < k/2\Big).$

Intuitively, since $\mathbb{E}[W_i^{(n)}(\kappa)] \approx \nu_n \approx \nu = 1$ for $\kappa > 0$ small, the Chernoff bound proves that
$\mathbb{P}(\sum_{i=1}^{k} W_i^{(n)}(\kappa) < k/2)$ is exponentially small in $k$, so that the term in (5.6) is exponentially small.
We now make this intuition precise.

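By the Chernoff bound, for each $\theta > 0$, and by the fact that $(W_i^{(n)}(\kappa))_{i\in[k]}$ are i.i.d. random variables,
we have

$\mathbb{P}\Big(\sum_{i=1}^{k} W_i^{(n)}(\kappa) < k/2\Big) \le e^{\theta k/2}\,\mathbb{E}\big[e^{-\theta\sum_{i=1}^{k} W_i^{(n)}(\kappa)}\big] = \big(e^{\theta/2}\phi_{n,\kappa}(\theta)\big)^k,$ (5.18)

where

$\phi_{n,\kappa}(\theta) = \mathbb{E}\big[e^{-\theta W_1^{(n)}(\kappa)}\big]$ (5.19)

denotes the Laplace transform of $W_1^{(n)}(\kappa)$. By (5.18), it suffices to prove that there exists a $\theta > 0$ such that,
uniformly in $n$ sufficiently large, $\theta/2 + \log\phi_{n,\kappa}(\theta) < 0$. This is what we shall show now. By dominated
convergence, for each fixed $\theta > 0$,

$\log\phi_{n,\kappa}(\theta) \to \log\phi_{\kappa}(\theta) = \log\mathbb{E}[e^{-\theta W(\kappa)}],$ (5.20)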
where

P(W(k) <x) = E[[l - F}- 1 ^) | U > k), (5.21) 

and U is a uniform random variable on [0, 1]. As a result, the distribution of U conditionally on U > k 

is uniform on [k, 1]. Let U K denote a uniform random variable on [k, 1], so that W{k) = [1 - 

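Then, $W(\kappa)$ has mean $\mathbb{E}[W(\kappa)] \ge 3/4$ and bounded variance $\sigma_{\kappa}^2$ (since $W(\kappa) \le [1 - F]^{-1}(\kappa) < \infty$ a.s.).

Therefore, a Taylor expansion yields that, for fixed $\kappa > 0$,

$\log\phi_{\kappa}(\theta) \le -3\theta/4 + \sigma_{\kappa}^2\theta^2 + o(\theta^2).$ (5.22)

Now, fix a $\theta > 0$ so small that

$\theta/2 - 3\theta/4 + \sigma_{\kappa}^2\theta^2 + o(\theta^2) \le -\theta/6,$ (5.23)
and then $N$ so large that, for all $n \ge N$,

$\log\phi_{n,\kappa}(\theta) \le \log\phi_{\kappa}(\theta) + \theta/12.$ (5.24)

Then, indeed, for $n \ge N$, since $\theta > 0$,

$\theta/2 + \log\phi_{n,\kappa}(\theta) \le -\theta/6 + \theta/12 = -\theta/12 < 0,$ (5.25)

so that

$e^{\theta/2}\phi_{n,\kappa}(\theta) \le e^{-\theta/12},$ (5.26)

which, in turn, implies that

$\sum_{v=1}^{n} \mathbb{P}(|C^{[K]}(v)| \ge k, W^{[K]}(v) < k/2) \le ne^{-\theta k/12}.$ (5.27)

When $n\to\infty$, this proves the claim for $J = \theta/12$. □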
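The Chernoff step (5.18) can be checked on a toy example in which everything is explicit. The sketch below assumes i.i.d. variables equal to 0.5 or 1.5 with probability 1/2 each, a hypothetical stand-in for $W_i^{(n)}(\kappa)$ chosen only because its mean is at least $3/4$ and its Laplace transform is available in closed form.

```python
import math


def chernoff_bound(theta: float, k: int) -> float:
    """Bound (e^{theta/2} * phi(theta))^k on P(W_1 + ... + W_k <= k/2), where
    the W_i are i.i.d. and equal to 0.5 or 1.5 with probability 1/2 each."""
    laplace = 0.5 * (math.exp(-0.5 * theta) + math.exp(-1.5 * theta))  # E[e^{-theta W}]
    return (math.exp(theta / 2) * laplace) ** k


def exact_probability(k: int) -> float:
    """The sum is at most k/2 only if every W_i equals 0.5."""
    return 0.5 ** k
```

Here $e^{\theta/2}\phi(\theta) = (1 + e^{-\theta})/2 < 1$ for every $\theta > 0$, so the bound decays exponentially in $k$ and approaches the exact value $2^{-k}$ as $\theta$ grows.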

To prove Theorem 1.6(a), we apply Lemma 5.1 to the term in (5.6), which is then bounded by Q~^( £nP ) 
when we take k = en p . 
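(b) We denote by

$Z_{\ge k} = \sum_{v=1}^{n} 1_{\{|C(v)|\ge k\}}$ (5.28)

the number of vertices that are contained in connected components of size at least $k$. In [24], the random
variable $Z_{\ge k}$ has been used in a crucial way to prove probabilistic bounds on $|C_{\max}|$. We now slightly
extend these results.

We shall prove that, for all $\varepsilon > 0$ sufficiently small, there exist constants $b_2, C$ such that

$\mathbb{P}\big(Z_{\ge\varepsilon n^{\rho}} \le b_2 n^{\rho}\varepsilon^{-1/(\tau-2)}\big) \le C\varepsilon^{2/(\tau-2)}.$ (5.29)

We first note that it suffices to prove (5.29) when $\nu_n \le 1 - Kn^{-\eta}$. Indeed, the random variable $Z_{\ge\varepsilon n^{\rho}}$ is
increasing in the edge occupation statuses, and, therefore, we may take $\lambda < 0$ so that $-\lambda \ge K$ to achieve
the claim.

We shall use a second moment method. By [24, Proposition 2.4(b)], there exists $a_2 = a_2(K)$ such that

$\mathbb{E}[Z_{\ge\varepsilon n^{\rho}}] = n\mathbb{P}(|C(V)| \ge \varepsilon n^{\rho}) \ge a_2 n^{\rho}\varepsilon^{-1/(\tau-2)},$ (5.30)
where $V$ is chosen uniformly from $[n]$. Therefore, when we take $b_2 = a_2/2$,

$\mathbb{P}\big(Z_{\ge\varepsilon n^{\rho}} \le b_2 n^{\rho}\varepsilon^{-1/(\tau-2)}\big) \le \mathbb{P}\big(Z_{\ge\varepsilon n^{\rho}} \le \mathbb{E}[Z_{\ge\varepsilon n^{\rho}}]/2\big).$ (5.31)
We take $\varepsilon > 0$ small, and bound, by the Chebychev inequality,

$\mathbb{P}\big(Z_{\ge\varepsilon n^{\rho}} \le \mathbb{E}[Z_{\ge\varepsilon n^{\rho}}]/2\big) \le \frac{4\mathrm{Var}(Z_{\ge\varepsilon n^{\rho}})}{\mathbb{E}[Z_{\ge\varepsilon n^{\rho}}]^2}.$ (5.32)

By [24, Proposition 2.2] and [24, Proposition 2.5 and (2.22)], uniformly in $k \ge 1$,

$\mathrm{Var}(Z_{\ge k}) \le n\mathbb{E}[|C(V)|] \le n^{1+\eta} = n^{2\rho}.$ (5.33)

As a result, we obtain

$\mathbb{P}\big(Z_{\ge\varepsilon n^{\rho}} \le \mathbb{E}[Z_{\ge\varepsilon n^{\rho}}]/2\big) \le \frac{4n^{2\rho}}{a_2^2\varepsilon^{-2/(\tau-2)}n^{2\rho}} = C\varepsilon^{2/(\tau-2)},$ (5.34)

which is small when $\varepsilon > 0$ is small. We conclude that, with probability at least $1 - o_{\varepsilon}(1)$, where $o_{\varepsilon}(1)$
denotes a function that is $o(1)$ uniformly in $n$ as $\varepsilon\downarrow 0$,

$Z_{\ge\varepsilon n^{\rho}} \ge \mathbb{E}[Z_{\ge\varepsilon n^{\rho}}]/2 \ge \frac{a_2}{2}\varepsilon^{-1/(\tau-2)}n^{\rho}.$ (5.35)

Since, by [24, Theorem 1.2], $|C_{\max}| \le \varepsilon^{-1/2}n^{\rho}$ with probability at least $1 - o_{\varepsilon}(1)$, there are, again with
probability at least $1 - o_{\varepsilon}(1)$, at least

$\frac{a_2}{2}\varepsilon^{-1/(\tau-2)}n^{\rho}/(\varepsilon^{-1/2}n^{\rho}) = C\varepsilon^{1/2-1/(\tau-2)}$ (5.36)

clusters of size at least $\varepsilon n^{\rho}$. Since $1/2 - 1/(\tau-2) < 0$, the number of clusters of size at least $\varepsilon n^{\rho}$ tends to
infinity when $\varepsilon\downarrow 0$. By part (a), whp, these clusters will be part of $(|C_{\le}(i)|)_{i\in[K]}$ when
$K = K(\varepsilon) \ge 1$ is sufficiently large. □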
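The second moment step (5.32) is an instance of the Chebychev inequality $\mathbb{P}(Z \le \mathbb{E}[Z]/2) \le 4\mathrm{Var}(Z)/\mathbb{E}[Z]^2$. The sketch below compares this bound with an exact lower tail for a binomial variable; the binomial is a hypothetical stand-in for $Z_{\ge\varepsilon n^{\rho}}$, whose law is not available in closed form.

```python
from math import comb


def chebyshev_half_mean_bound(mean: float, variance: float) -> float:
    """Chebychev bound 4 Var(Z) / E[Z]^2 on P(Z <= E[Z] / 2)."""
    return 4.0 * variance / mean ** 2


def binomial_lower_tail(n: int, p: float, threshold: float) -> float:
    """Exact P(Z <= threshold) for Z ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1) if k <= threshold)
```

For a Binomial(100, 0.3) variable the mean is 30 and the variance 21, so the bound equals $84/900$, while the exact probability of being at most 15 is far smaller; the bound is crude but sufficient for the argument above.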

We now complete the proof of Theorem 1.5: 

Proof of Theorem 1.5. We use Proposition 3.7, and note that the limiting variables are all non-trivial
(i.e., they are equal to 0 or 1 each with positive probability). This proves (1.27). The proof of (1.28) is
similar, noting that $|C_{\le}(i)|$ equals $|C_{\max}|$ with strictly positive probability. □

6 Proof of Theorem 1.3 

In this section, we shall prove Theorem 1.3 on the largest subcritical clusters. We shall extend the result
also to the ordered weights of subcritical clusters as formulated in Theorem 1.4, which shall be a crucial
ingredient in the proof of Theorem 1.2, which is given in Section 7 below.

We shall prove that Theorem 1.3 holds both for $W_{(j)}$ as well as for $|C_{(j)}|$. Indeed, it shall also follow
from the result that whp, $W_{(j)} = \sum_{i\in C_{(j)}} w_i$, i.e., the $j$th largest cluster weight is the weight of the $j$th
largest cluster, as claimed in Theorem 1.4.

To prove this scaling, we shall prove that, when the weights are equal to $w(\lambda_n)$ as defined in (1.15),
and when $\lambda_n\to-\infty$,

$|\lambda_n|n^{-\rho}|C(j)| \xrightarrow{\mathbb{P}} c_j, \quad |\lambda_n|n^{-\rho}W(j) \xrightarrow{\mathbb{P}} c_j,$ (6.1)

where we recall that

$c_j = c_F^{\alpha}j^{-\alpha} = \lim_{n\to\infty} n^{-\alpha}w_j.$ (6.2)

Since $j\mapsto c_j$ is strictly decreasing, this means that, whp, $C_{(j)} = C_{\le}(j)$. Thus, this also implies that
whp, for all $j \le m$, $C_{(j)} = C(j)$. Then (6.1) proves the result for the ordered cluster sizes and weights.

Recall the definitions of $T$, $T(i)$ and their weights $w_T$ and $w_{T(i)}$ introduced in Section 2.2, where also
their moments are computed in Lemma 2.3. We make frequent use of these computations. The proof of
Theorem 1.3 consists of four key steps, which we shall prove one by one.
Asymptotics of mean cluster size and weight of high-weight vertices. In the following lemma
we investigate the means of $|C(j)|$ and $W(j)$:

Lemma 6.1 (Mean cluster size and weights). As $n\to\infty$, for every $j\in\mathbb{N}$ fixed, and when $\lambda_n\to-\infty$
such that $\nu_n(\lambda_n)\to 1$,

$\mathbb{E}[|C(j)|] = \frac{w_j}{1-\nu_n(\lambda_n)}(1 + o(1)), \quad \mathbb{E}[W(j)] = \frac{w_j}{1-\nu_n(\lambda_n)}(1 + o(1)).$ (6.3)

Proof. By the fact that \C(j)\ and T(j) can be coupled so that \C(j)\ < T(j) a.s., we obtain that 

It) ' 

E[\C(j)\]<E[T(j)} = - L— , (6.4) 

the latter equality following from Lemma 2.3(c). A similar upper bound follows for E[W(j)] now using 
Lemma 2.3(d). 

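For the lower bound, we rewrite

$\mathbb{E}[|C(j)|] = \mathbb{E}[T(j)] - \mathbb{E}[T(j) - |C(j)|].$ (6.5)

Now, for $a_n = n^{\rho} \gg \mathbb{E}[T(j)]$, we bound

$\mathbb{E}[T(j) - |C(j)|] \le \mathbb{E}[T(j)1_{\{T(j)>a_n\}}] + \mathbb{E}\big[[T(j) - |C(j)|]1_{\{T(j)\le a_n\}}\big].$ (6.6)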
By Lemma 2.3(c), the first term in (6.6) is bounded by 

E[T(j)l {TU>an} ] < ±-E[T(j) 2 } (6.7) 

J_/7l + W i ^ , Wj{l + V n {\ n )) Wj 1 ^ 3 

a n V ^l-u n {X n y ^ (i_^(A n )) 2 ^ (l-z, n (A n )) 3 4 ff, 1 

ie[n\ 

The first two terms in (6.7) are $o(w_j/(1-\nu_n(\lambda_n)))$ since $\nu_n(\lambda_n) = 1 - n^{-\eta}|\lambda_n| + o(n^{-\eta}|\lambda_n|)$ by (2.30) and
the fact that $\lambda_n \to -\infty$, so that

$$w_j/(1-\nu_n(\lambda_n)) \le c n^{\alpha} n^{\eta} |\lambda_n|^{-1} = c n^{\rho} |\lambda_n|^{-1} = o(n^{\rho}) = o(a_n), \tag{6.8}$$

since $\alpha + \eta = \rho$ (recall (2.10)). The last term in (6.7) is bounded by

_ cn 3a-l-p-2r,\ Xn \-2_ (g g) 



'Vv— 1 



1 - Vn(X n ) a n (l - f n (A„)) 2 1 - v n (X n ) 

By (2.10), $3\alpha - 1 - \rho - 2\eta = 3(\tau - 4)/(\tau - 1) < 0$, so that also the second term is $o(w_j/(1-\nu_n(\lambda_n)))$.

For the second term in (6.6), we note that differences between $T(j)$ and $|\mathcal{C}(j)|$ arise due to vertices
which have been used at least twice in $T(j)$. Indeed, as explained in more detail in Section 2.2, the law of
$|\mathcal{C}(j)|$ can be obtained from the branching process by removing vertices (and their complete offspring) of
which the mark has already been used (see the description of the cluster exploration in Section 2.1 and
the relation to branching processes described in Sections 2.1-2.2). Thus, when we draw vertex $i$ twice,
then the second time we must thin the entire tree that is rooted at this vertex with mark $i$. The expected
number of vertices in the tree equals $\mathbb{E}[T(i)]$, so that we arrive at

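$$\begin{aligned}
\mathbb{E}\big[[T(j) - |\mathcal{C}(j)|]\mathbb{1}_{\{T(j) \le a_n\}}\big] &\le \sum_{i\in[n]} \mathbb{E}\big[[T(i) - |\mathcal{C}(i)|]\mathbb{1}_{\{T(j) \le a_n\}}\mathbb{1}_{\{\text{mark } i \text{ drawn at least twice}\}}\big] \\
&\le \sum_{i\in[n]} \mathbb{E}[T(i)] \sum_{s_1,s_2=1}^{a_n} \mathbb{P}(\text{mark } i \text{ drawn at times } s_1, s_2).
\end{aligned} \tag{6.10}$$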
Now, $i$ can only be chosen at time $s_1$ when $T(j) > s_1 - 1$, which is independent of the event that the
mark $i$ is chosen at times $s_1, s_2$. Therefore,



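$$\begin{aligned}
\mathbb{E}\big[[T(j) - |\mathcal{C}(j)|]\mathbb{1}_{\{T(j) \le a_n\}}\big] &\le \sum_{i\in[n]} \mathbb{E}[T(i)] \sum_{s_1,s_2=1}^{a_n} \mathbb{P}(T(j) > s_1 - 1)\frac{w_i^2}{\ell_n^2} \\
&\le a_n \sum_{s_1=1}^{a_n} \mathbb{P}(T(j) > s_1 - 1) \sum_{i\in[n]} \mathbb{E}[T(i)]\frac{w_i^2}{\ell_n^2} \\
&\le a_n \mathbb{E}[T(j)] \sum_{i\in[n]} \mathbb{E}[T(i)]\frac{w_i^2}{\ell_n^2}.
\end{aligned} \tag{6.11}$$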
This is $o(w_j/(1-\nu_n(\lambda_n)))$ when $\lambda_n \to -\infty$, since

i(l-z^n(A„)) |A 

This completes the proof for $\mathbb{E}[|\mathcal{C}(j)|]$. The proof for $\mathcal{W}(j)$ is similar. Indeed, we split



ie[n] 71 ie[n] 



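$$\mathbb{E}[w_{T(j)} - \mathcal{W}(j)] \le \mathbb{E}[w_{T(j)}\mathbb{1}_{\{T(j) > a_n\}}] + \mathbb{E}\big[[w_{T(j)} - \mathcal{W}(j)]\mathbb{1}_{\{T(j) \le a_n\}}\big]. \tag{6.13}$$

The first term is now bounded by

$$\mathbb{E}[w_{T(j)}\mathbb{1}_{\{T(j) > a_n\}}] \le \frac{1}{a_n}\mathbb{E}[w_{T(j)} T(j)], \tag{6.14}$$

which we can again bound using $\mathbb{E}[w_{T(j)} T(j)] \le \mathbb{E}[w_{T(j)}^2] + \mathbb{E}[T(j)^2]$ together with Lemma 2.3(a-b).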
Further, 

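$$\begin{aligned}
\mathbb{E}\big[[w_{T(j)} - \mathcal{W}(j)]\mathbb{1}_{\{T(j) \le a_n\}}\big] &\le \sum_{i\in[n]} \mathbb{E}\big[[w_{T(i)} - \mathcal{W}(i)]\mathbb{1}_{\{T(j) \le a_n\}}\mathbb{1}_{\{\text{mark } i \text{ drawn at least twice}\}}\big] \\
&\le \sum_{i\in[n]} \mathbb{E}[w_{T(i)}] \sum_{s_1,s_2=1}^{a_n} \mathbb{P}(\text{mark } i \text{ drawn at times } s_1, s_2)
\end{aligned} \tag{6.15}$$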

< a nE [T(,)] E = a - (1 - u (X ) )2 E #> 

i6[n] n 1 WnJJ £ n 

so that 

E[W(j)] > E[w m ] - ±-E[w T{j) T(j)] - a ^. E |f • (6-16) 

ie[n] n 

We bound $\mathbb{E}[w_{T(j)} T(j)] \le \mathbb{E}[w_{T(j)}^2] + \mathbb{E}[T(j)^2]$. Now we can simply follow the argument for $\mathbb{E}[|\mathcal{C}(j)|]$. □

Cluster size and weight of high-weight vertices are concentrated. We note that, by the stochastic
domination and the fact that $\mathbb{E}[|\mathcal{C}(j)|] = \frac{w_j}{1-\nu_n(\lambda_n)}(1+o(1))$, we have

$$\mathrm{Var}(|\mathcal{C}(j)|) \le \mathrm{Var}(T(j)) + o(\mathbb{E}[T(j)]^2). \tag{6.17}$$

By Lemma 2.3(a), 

VarTO)) = Sll±i^M + St (' £ „?) _ ( ^ ), (6.18) 

l-2/ n (A n ) (1 - v n (K)Y ~ J ^ (l-^„(A n )) 2 

je[n] 
since $j$ is fixed and

'i Y, U}3 ^ 1 ~ MK))- 1 = £in 3a - 1+V = o(n a ) = o( Wj ). (6.19) 



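For $w_{T(j)}$ the argument is identical. We conclude that, for $j$ fixed, $\mathrm{Var}(|\mathcal{C}(j)|) = o(\mathbb{E}[|\mathcal{C}(j)|]^2)$ and
$\mathrm{Var}(\mathcal{W}(j)) = o(\mathbb{E}[\mathcal{W}(j)]^2)$, so that

$$\frac{|\mathcal{C}(j)|}{\mathbb{E}[|\mathcal{C}(j)|]} \xrightarrow{\mathbb{P}} 1, \qquad \frac{\mathcal{W}(j)}{\mathbb{E}[\mathcal{W}(j)]} \xrightarrow{\mathbb{P}} 1, \tag{6.20}$$

and then Lemma 6.1 completes the proof of (6.1).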

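Cluster weight sums. We start by proving a convenient result relating the cluster weights $\mathcal{W}_{\le}(j)$ and
$\mathcal{W}(j)$:

Lemma 6.2 (Cluster weight properties). (a) For every integer $m \ge 2$,

$$\sum_{j\in[n]} \mathcal{W}_{\le}(j)^m = \sum_{i\in[n]} w_i \mathcal{W}(i)^{m-1}. \tag{6.21}$$

(b) For every $i,j \in [n]$,

$$\mathbb{E}[\mathcal{W}(i)\mathcal{W}(j)\mathbb{1}_{\{i \nleftrightarrow j\}}] \le \mathbb{E}[\mathcal{W}(i)]\,\mathbb{E}[\mathcal{W}(j)]. \tag{6.22}$$

Proof. (a) We compute

$$\begin{aligned}
\sum_{j\in[n]} \mathcal{W}_{\le}(j)^m &= \sum_{j\in[n]} \sum_{i_1,\ldots,i_m} \prod_{s=1}^{m} w_{i_s} \mathbb{1}_{\{i_s \in \mathcal{C}(i_1)\ \forall s=2,\ldots,m,\ \min \mathcal{C}(i_1) = j\}} \\
&= \sum_{i_1,\ldots,i_m} \prod_{s=1}^{m} w_{i_s} \mathbb{1}_{\{i_s \in \mathcal{C}(i_1)\ \forall s=2,\ldots,m\}} = \sum_{i\in[n]} w_i \mathcal{W}(i)^{m-1}.
\end{aligned} \tag{6.23}$$

(b) We write out

$$\begin{aligned}
\mathbb{E}[\mathcal{W}(i)\mathcal{W}(j)\mathbb{1}_{\{i \nleftrightarrow j\}}] &= \sum_{k,l} w_k w_l \mathbb{P}(i \leftrightarrow k,\ j \leftrightarrow l,\ i \nleftrightarrow j) \le \sum_{k,l} w_k w_l \mathbb{P}(i \leftrightarrow k)\mathbb{P}(j \leftrightarrow l) \\
&= \mathbb{E}[\mathcal{W}(i)]\,\mathbb{E}[\mathcal{W}(j)],
\end{aligned} \tag{6.24}$$

by the BK-inequality (see [20, Section 2.3]). □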
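
The identity in part (a) is deterministic: it holds for every realization of the graph, because each cluster is counted exactly once, through its minimal vertex. The following is a minimal Python sketch that checks it on a small hand-made graph. The edge list, the integer weights and the helper names `cluster_weights` and `check_identity` are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def cluster_weights(n, edges, w):
    """Return (W, W_min) for a graph on vertices 0..n-1.

    W[i]     : total weight of the cluster containing i.
    W_min[j] : W[j] if j is the minimal vertex of its cluster, else 0.
    """
    parent = list(range(n))

    def find(x):
        # Path halving; the root of each cluster is kept equal to its minimum.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Attach the larger root below the smaller one.
            parent[max(ra, rb)] = min(ra, rb)

    total = defaultdict(int)
    for i in range(n):
        total[find(i)] += w[i]

    W = [total[find(i)] for i in range(n)]
    W_min = [W[j] if find(j) == j else 0 for j in range(n)]
    return W, W_min


def check_identity(n, edges, w, m):
    """Return both sides of sum_j W_min(j)^m = sum_i w_i W(i)^(m-1)."""
    W, W_min = cluster_weights(n, edges, w)
    lhs = sum(x ** m for x in W_min)
    rhs = sum(w[i] * W[i] ** (m - 1) for i in range(n))
    return lhs, rhs


if __name__ == "__main__":
    # Hypothetical example: clusters {0,1,2}, {3,4} and {5}.
    example_edges = [(0, 1), (1, 2), (3, 4)]
    example_weights = [5, 3, 2, 4, 1, 7]
    for m in (2, 3):
        print(m, check_identity(6, example_edges, example_weights, m))
```

The cases $m = 2$ and $m = 3$ are the ones used later, for sums of squared and cubed cluster weights respectively.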

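Only high-weight vertices matter. We start by proving that the probability that, for $K \ge 1$, there
exists a $j > K$ such that $\mathcal{W}_{\le}(j) \ge \varepsilon n^{\rho}/|\lambda_n|$ is small. Since, for all $j \le K$, we have that $|\lambda_n| n^{-\rho} \mathcal{W}(j) \xrightarrow{\mathbb{P}} c_j$,
we have that, for all $j \le m$ and $m$ such that $c_m > \varepsilon$, $\mathcal{W}(j) = \mathcal{W}_{(j)}$.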

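Recall that $\mathcal{W}^{[K]}(j)$ is the weight of the cluster of $j$ in the random graph only making use of the
vertices in $[n] \setminus [K]$, and let $\mathcal{W}_{\le}^{[K]}(j) = \mathcal{W}^{[K]}(j)$ when $j$ is the minimal element in $\mathcal{C}^{[K]}(j)$. If there exists a
$j > K$ such that $\mathcal{W}_{\le}(j) \ge \varepsilon n^{\rho}/|\lambda_n|$, then

$$\sum_{j>K} \mathcal{W}_{\le}^{[K]}(j)^3 = \sum_{j>K} w_j \mathcal{W}^{[K]}(j)^2 \ge \frac{\varepsilon^3 n^{3\rho}}{|\lambda_n|^3}, \tag{6.25}$$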

where we have used Lemma 6.2 for the first equality. Since 

4 > E w h ( 6 -26) 
we see that this random graph is stochastically bounded by the random graph having weights $w^{[K]}$, where
$w_j^{[K]} = 0$ when $j \le K$ and $w_j^{[K]} = w_j$ otherwise. By the Markov inequality,



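$$\mathbb{P}(\exists j > K \colon \mathcal{W}_{\le}(j) \ge \varepsilon n^{\rho}/|\lambda_n|) \le \varepsilon^{-3} n^{-3\rho} |\lambda_n|^3 \sum_{j>K} w_j \mathbb{E}[\mathcal{W}^{[K]}(j)^2]. \tag{6.27}$$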

We note that we can again stochastically dominate $|\mathcal{C}^{[K]}(j)|$ by $T^{[K]}(j)$ and $\mathcal{W}^{[K]}(j)$ by $w_{T^{[K]}(j)}$, where
now the offspring distribution is equal to Jfj = Xf v) \^ Mi>K ^ (recall Section 2.1). Therefore, by Lemma 
2.3(d), we obtain that 

W [K] 2 W l ' <] 1 

E[H«Gf ] < EK„ % ,l = ( T -^) + J^XZ £<^'> S )' (6 ' 28) 

«£[n] 

where 



wjl {j>K} , e ] =E« ] )V^. (6.29) 
j'e[n] 

It is not hard to see that, for each K > 1 fixed, as n — > oo, 

2 i n 

Substitution of the bound (6.28) in the right-hand side of (6.27) and performing the sum over j gives that 

< j— ^(i + T ^y) X>? £ «-«(„7|A„|) 3 , (0.31) 

so that 

P(3j > K: W<(j) > en p /\X n \) < |A n | V^^CiT^"^/^^) 3 = CK 1 ' 3 ^ 3 , (6.32) 

which can be made arbitrarily small by taking K = K(e) large. 

We complete this section by proving that the probability that there exists a $j > K$ such that $|\mathcal{C}_{\le}(j)| \ge \varepsilon n^{\rho}/|\lambda_n|$
is small. For this, we use Lemma 5.1, which proves that, whp, if $|\mathcal{C}_{\le}(j)| \ge \varepsilon n^{\rho}/|\lambda_n|$, then also
$\mathcal{W}_{\le}(j) \ge \varepsilon n^{\rho}/(2|\lambda_n|)$. Thus, the result for cluster sizes follows from the proof for cluster weights. This
completes the proof of Theorem 1.3. □



7 Proof of Theorem 1.2 



In this section, we prove Theorem 1.2. We start by using [4, Proposition 7] to show that the random 
graph multiplicative coalescent converges (recall Lemma 1.7). 



Convergence of the random graph multiplicative coalescent at fixed time. We apply [4, Proposition 7],
which gives conditions to show that, for fixed $\lambda \in \mathbb{R}$, the random sequence $X^{(n)}(|\lambda_n| + \lambda)$ converges
in distribution to a random variable which has the same distribution as the $(0, \beta, d)$-multiplicative
coalescent at time $\lambda$ when three conditions are satisfied about the initial state $x^{(n)} = X^{(n)}(0)$. To state
these conditions, we define, for $r = 2,3$, with $x^{(n)} = (x_j^{(n)})_{j\ge 1}$,

$$\sigma_r(x^{(n)}) = \sum_{j\ge 1} (x_j^{(n)})^r. \tag{7.1}$$



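Then, the conditions in [4, Proposition 7] are that, as $\lambda_n \to -\infty$:
(a)

$$|\lambda_n|\big(|\lambda_n|\sigma_2(x^{(n)}) - 1\big) \xrightarrow{\mathbb{P}} -\beta; \tag{7.2}$$

(b)

$$\frac{x_j^{(n)}}{\sigma_2(x^{(n)})} \xrightarrow{\mathbb{P}} d_j; \tag{7.3}$$

(c)

$$|\lambda_n|^3 \sigma_3(x^{(n)}) \xrightarrow{\mathbb{P}} \sum_{i=1}^{\infty} d_i^3. \tag{7.4}$$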

The conditions (a)-(c) above are not precisely what is in [4, Proposition 7], and we start by explaining
how (a)-(c) imply the conditions for [4, Proposition 7]. Indeed, in [4, Proposition 7], the condition in (a)
is replaced by $\sigma_2(x^{(n)}) \to 0$, and the process

$$X^{(n)}\Big(\lambda + \frac{1}{\sigma_2(x^{(n)})}\Big) \tag{7.5}$$

is proved to converge to the realization of a $(0, 0, d)$-multiplicative coalescent at time $\lambda$. Under condition
(a) (and the fact that $\lambda_n \to -\infty$), (a) implies that $1/\sigma_2(x^{(n)}) = |\lambda_n| - \beta + o(1)$. Since if $(X(t))_t$ is a
multiplicative coalescent with parameters $(0, 0, d)$, then $(X(t - \beta))_t$ is a multiplicative coalescent with
parameters $(0, \beta, d)$ (see [4, (13)]), and using the continuity proved in [4, Lemma 27], this proves the fact
that $X^{(n)}(|\lambda_n| + \lambda)$ converges in distribution to a random variable which has the same distribution as a
$(0, \beta, d)$-multiplicative coalescent at time $\lambda$. Also, condition (c) is replaced by the condition that

$$\frac{\sigma_3(x^{(n)})}{\sigma_2(x^{(n)})^3} \xrightarrow{\mathbb{P}} \sum_{j} d_j^3, \tag{7.6}$$

which follows from a combination of (a) and (c). Further, in (a)-(c), we work with convergence in
probability (as the initial state is a random variable), while in [4, Proposition 7], the initial state is
considered to be deterministic. This is a minor change.

In the remainder of this section, we shall show that conditions (a)-(c) hold with $\beta = -\zeta/\mathbb{E}[W]$ and
$d_j = c_j/\mathbb{E}[W]$.

Asymptotics of $\sigma_2(x^{(n)})$. In the following lemma, we state the properties of $\sigma_2(x^{(n)})$ that we shall rely
on. In order to state the result, we recall that

$$\sigma_2(x^{(n)}) = \sum_{j} (x_j^{(n)})^2, \tag{7.7}$$

where $x_j^{(n)} = n^{-\rho} \mathcal{W}_{(j)}$ and where the vertex weights are now given by

$$w_j(0) = (1 + \lambda_n \ell_n n^{-2\rho}) w_j = w_j(\lambda_n \ell_n n^{\eta - 2\rho}) = w_j(\lambda_n \ell_n / n), \tag{7.8}$$

since $2\rho - \eta = 1$, so that

$$w(0) = w(\lambda_n \ell_n / n) = w(\mathbb{E}[W]\lambda_n)(1 + o(1)). \tag{7.9}$$

Now,

$$\sigma_2(x^{(n)}) = n^{-2\rho} \sum_{i\ge 1} \mathcal{W}_{(i)}^2 = n^{-2\rho} \sum_{i\ge 1} \mathcal{W}_{\le}(i)^2, \tag{7.10}$$

and, thus, by Lemma 6.2,

$$\sigma_2(x^{(n)}) = n^{-2\rho} \sum_{j\in[n]} \mathcal{W}_{\le}(j)^2 = n^{-2\rho} \sum_{j\in[n]} w_j \mathcal{W}(j). \tag{7.11}$$

We continue by investigating the mean and variance of the above sum: 
Lemma 7.1 (Mean and variance of $\sigma_2(x^{(n)})$). When the weights $w(\lambda_n)$ satisfy that $\nu_n(\lambda_n) \le 1 - |\lambda_n| n^{-\eta}$,
then

(i)

$$\mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big] = \frac{\ell_n \nu_n(\lambda_n)}{1 - \nu_n(\lambda_n)} + o(n^{2\rho}|\lambda_n|^{-2}); \tag{7.12}$$

(ii) 

Var( u>i>V(0) < 4E[^t] < C(E[«, T ] 4 i £ Wl 4 + EK] 2 E[^]i £ w f). (7.13) 

i£[nl n «e[nl n iefnl 



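Proof. (i) We bound

$$\mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big] \le \mathbb{E}\Big[\sum_{i\in[n]} w_i w_{T(i)}\Big] = \sum_{i\in[n]} \frac{w_i^2}{1 - \nu_n(\lambda_n)} = \frac{\ell_n \nu_n(\lambda_n)}{1 - \nu_n(\lambda_n)}. \tag{7.14}$$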



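For the lower bound, we make use of a bound similar to (6.16),

$$\begin{aligned}
\mathbb{E}[w_{T(i)} - \mathcal{W}(i)] &\le \sum_{j\in[n]} \mathbb{E}\big[[w_{T(j)} - \mathcal{W}(j)]\mathbb{1}_{\{\text{mark } j \text{ drawn at least twice}\}}\big] \\
&\le \sum_{j\in[n]} \mathbb{E}[w_{T(j)}] \sum_{s_1 < s_2} \mathbb{P}(\text{mark } j \text{ drawn at times } s_1, s_2).
\end{aligned} \tag{7.15}$$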

Now, there are two contributions, depending on whether $s_2$ is in the family tree of $s_1$ or not. When it is not,
then the events {mark $j$ drawn at time $s_1$} and {mark $j$ drawn at time $s_2$} are completely independent,
and we arrive at

up'- 

P(mark j drawn at times s 1 ,s 2 ) = -± ^ P ( T 00 > s i) < ^-E[T(t) 2 ]. (7.16) 

S\<S2 n Si<S2 n 

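When $s_2$ is in the family tree of $s_1$, then we obtain the bound

$$\sum_{s_1 < s_2} \mathbb{P}(j \text{ chosen at times } s_1, s_2) = \frac{w_j^2}{\ell_n^2} \sum_{s_1 < s_2} \mathbb{P}(T(i) \ge s_1)\,\mathbb{P}(s_2 \in T_{s_1}), \tag{7.17}$$

where we denote the tree rooted at $s_1$ by $T_{s_1}$. Thus,

$$\sum_{s_2} \mathbb{P}(s_2 \in T_{s_1}) = \mathbb{E}[T(j)], \tag{7.18}$$

and we arrive at a contribution of

$$\sum_{s_1 < s_2} \mathbb{P}(j \text{ chosen at times } s_1, s_2) \le \frac{w_j^2}{\ell_n^2} \sum_{s_1} \mathbb{P}(T(i) \ge s_1)\,\mathbb{E}[T(j)] = \frac{w_j^2}{\ell_n^2}\,\mathbb{E}[T(i)]\,\mathbb{E}[T(j)]. \tag{7.19}$$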

Therefore, 

_[ 

f 2 



11) ■ 

E[w T(l) - W(i)] < EWiWl +E[T(i)]E[r(7)]) (7.20) 



E 



1 - ^n(An) f-f. i l n (1 - ^n(A n ))^ f-f t l n 
Thus, we obtain

ie[n] 

We bound 



1 ^1 2 1 sr^ w i 

v (\ ) 2^i2~2^ Wi i\- v (x ) 3 

i<E[n\ i€[n\ i€[nj ny nJ j€{n] n ie[n] V M n> je[n\ 



E M E gjr^ < .^r^p ( E '+ C E E f <™> 

$$\le C|\lambda_n|^{-3} n^{3\eta - 2 + 6\alpha} + C|\lambda_n|^{-4} n^{4\eta - 1 + 3\alpha}.$$

Now, $3\eta - 2 + 6\alpha = 1 < 2\rho = 2(\tau - 2)/(\tau - 1)$, since $\tau > 3$, so that the first term is $o(n^{2\rho}/|\lambda_n|^2)$. For the
second term, $4\eta - 1 + 3\alpha = 2\rho + (\tau - 4)/(\tau - 1) < 2\rho$, so this term is also $o(n^{2\rho}|\lambda_n|^{-2})$. Similarly,

E ^j^y, E I - E g * d ,, 22 , 

Again, $3\eta - 1 + 4\alpha = 2(\tau - 3)/(\tau - 1) < 2\rho$, so also this contribution is $o(n^{2\rho}|\lambda_n|^{-2})$.
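(ii) We shall start by bounding the second moment. For this, we rewrite

$$\mathbb{E}\Big[\Big(\sum_{i\in[n]} w_i \mathcal{W}(i)\Big)^2\Big] = \sum_{i_1,i_2} w_{i_1} w_{i_2} \mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)]. \tag{7.23}$$

Now we split

$$\mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)] = \mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)\mathbb{1}_{\{i_1 \leftrightarrow i_2\}}] + \mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)\mathbb{1}_{\{i_1 \nleftrightarrow i_2\}}]. \tag{7.24}$$

By Lemma 6.2(b), the second term is bounded from above by $\mathbb{E}[\mathcal{W}(i_1)]\,\mathbb{E}[\mathcal{W}(i_2)]$. Therefore, summing
over $i_1, i_2$, we obtain that

$$\mathbb{E}\Big[\Big(\sum_{i\in[n]} w_i \mathcal{W}(i)\Big)^2\Big] \le \mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big]^2 + \sum_{i_1,i_2} w_{i_1} w_{i_2} \mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)\mathbb{1}_{\{i_1 \leftrightarrow i_2\}}], \tag{7.25}$$

so that

$$\begin{aligned}
\mathrm{Var}\Big(\sum_{i\in[n]} w_i \mathcal{W}(i)\Big) &= \mathbb{E}\Big[\Big(\sum_{i\in[n]} w_i \mathcal{W}(i)\Big)^2\Big] - \mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big]^2 \\
&\le \sum_{i_1,i_2} w_{i_1} w_{i_2} \mathbb{E}[\mathcal{W}(i_1)\mathcal{W}(i_2)\mathbb{1}_{\{i_1 \leftrightarrow i_2\}}] = \sum_{i\in[n]} w_i \mathbb{E}[\mathcal{W}(i)^3] \\
&\le \sum_{i\in[n]} w_i \mathbb{E}[w_{T(i)}^3] = \ell_n \mathbb{E}[w_T^3].
\end{aligned} \tag{7.26}$$

The upper bound on $\mathbb{E}[w_T^3]$ follows as in the proof of Lemma 2.3. □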

Check of convergence conditions. We conclude that we are left to prove that conditions (a), (b) 
and (c) in (7.2)-(7.4) hold. We shall prove these conditions in the order (b), (c) and (a), condition (a) 
being the most difficult one. 

Condition (b) follows from (6.1) and condition (a), as we show now. Substituting (7.9) into (6.1), we
obtain that

$$x_j^{(n)} = n^{-\rho} \mathcal{W}_{(j)} = \frac{c_j}{\mathbb{E}[W]|\lambda_n|}(1 + o_{\mathbb{P}}(1)) = (1 + o_{\mathbb{P}}(1))\, d_j/|\lambda_n|, \tag{7.27}$$

where $d_j = c_j/\mathbb{E}[W]$. Further, the first order asymptotics in condition (a) proves that $|\lambda_n|\sigma_2(x^{(n)}) \xrightarrow{\mathbb{P}} 1$,
so that the factor $1/\sigma_2(x^{(n)})$ in condition (b) can be replaced by a multiplication by $|\lambda_n|$. We conclude
that $|\lambda_n| x_j^{(n)} \xrightarrow{\mathbb{P}} d_j$, where $d_j = c_j/\mathbb{E}[W]$, as required.

For condition (c), we apply similar ideas, and start with

$$|\lambda_n|^3 \sigma_3(x^{(n)}) = \sum_{j\in[n]} \big(|\lambda_n| n^{-\rho} \mathcal{W}_{\le}(j)\big)^3. \tag{7.28}$$

The summands for $j > K$ can be bounded using Lemma 6.2 by

$$\sum_{j>K} \big(|\lambda_n| n^{-\rho} \mathcal{W}_{\le}(j)\big)^3 = (|\lambda_n| n^{-\rho})^3 \sum_{j>K} w_j \mathcal{W}^{[K]}(j)^2, \tag{7.29}$$

which is small in probability by the Markov inequality and (6.31). The summands for $j \le K$ converge in
probability by (6.1). Thus, condition (c) follows from (6.1) and (6.31).

We continue with condition (a), which is equivalent to the statement that

$$\sigma_2(x^{(n)}) = \frac{1}{|\lambda_n|} - \frac{\beta}{\lambda_n^2} + o_{\mathbb{P}}(|\lambda_n|^{-2}), \tag{7.30}$$

where $\beta = -\zeta/\mathbb{E}[W]$.

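We shall prove (7.30) by a second moment method. We first identify, by Lemma 6.2(a),

$$\sigma_2(x^{(n)}) = n^{-2\rho} \sum_{j\in[n]} \mathcal{W}_{\le}(j)^2 = n^{-2\rho} \sum_{i\in[n]} w_i \mathcal{W}(i). \tag{7.31}$$

Thus, in order to prove (7.30), it suffices to show that

$$\mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big] = n^{2\rho}\Big(|\lambda_n|^{-1} + \frac{\zeta}{\mathbb{E}[W]}|\lambda_n|^{-2} + o(|\lambda_n|^{-2})\Big), \tag{7.32}$$

and

$$\mathrm{Var}\Big(\sum_{i\in[n]} w_i \mathcal{W}(i)\Big) = o(n^{4\rho}|\lambda_n|^{-4}). \tag{7.33}$$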



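Indeed, by (7.32), we have that, for $n$ sufficiently large,

$$\mathbb{P}\Big(\Big|\sigma_2(x^{(n)}) - |\lambda_n|^{-1} - \frac{\zeta}{\mathbb{E}[W]}|\lambda_n|^{-2}\Big| > \varepsilon|\lambda_n|^{-2}\Big) \le \mathbb{P}\big(|\sigma_2(x^{(n)}) - \mathbb{E}[\sigma_2(x^{(n)})]| > \varepsilon|\lambda_n|^{-2}/2\big), \tag{7.34}$$

which, by the Chebyshev inequality, is bounded by

$$\mathbb{P}\Big(\Big|\sigma_2(x^{(n)}) - |\lambda_n|^{-1} - \frac{\zeta}{\mathbb{E}[W]}|\lambda_n|^{-2}\Big| > \varepsilon|\lambda_n|^{-2}\Big) \le \frac{4|\lambda_n|^4}{\varepsilon^2}\mathrm{Var}\big(\sigma_2(x^{(n)})\big) = o(1), \tag{7.35}$$

by (7.33). Thus, (7.30) follows from (7.32) and (7.33).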
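
The second moment method used here rests on Chebyshev's inequality, P(|X - E[X]| > t) <= Var(X)/t^2. The following is a minimal Python sketch that checks the inequality with exact rational arithmetic. The binomial law is only a stand-in chosen for illustration (an assumption), since the law of the quantity studied above is not explicit, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Exact probability mass function of Binomial(n, p), p a Fraction."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def chebyshev_check(n, p, t):
    """Return (exact tail probability, Chebyshev bound) for Binomial(n, p).

    The tail is P(|X - E[X]| > t) and the bound is Var(X) / t^2.
    """
    pmf = binomial_pmf(n, p)
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    tail = sum(q for k, q in enumerate(pmf) if abs(k - mean) > t)
    bound = var / t ** 2
    return tail, bound


if __name__ == "__main__":
    tail, bound = chebyshev_check(20, Fraction(1, 2), 5)
    print(float(tail), float(bound))
```

In the proof above the same inequality is applied with a threshold of order $|\lambda_n|^{-2}$, which is why a variance bound of smaller order than $|\lambda_n|^{-4}$ is what is needed.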

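To prove (7.32), we apply Lemma 7.1, in the setting that

$$\nu_n(\lambda_n) = \nu_n(1 + \lambda_n \ell_n n^{-2\rho}) = 1 + \lambda_n \ell_n n^{-2\rho} + \zeta n^{-\eta} + o(n^{-\eta}), \tag{7.36}$$

so that, by Lemma 7.1(i),

$$\begin{aligned}
\mathbb{E}\Big[\sum_{i\in[n]} w_i \mathcal{W}(i)\Big] &= \frac{\ell_n \nu_n(\lambda_n)}{1 - \nu_n(\lambda_n)} + o(n^{2\rho}|\lambda_n|^{-2}) \\
&= \nu_n(\lambda_n)\ell_n\big(|\lambda_n|\ell_n n^{-2\rho} - \zeta n^{-\eta} + o(n^{-\eta})\big)^{-1} + o(n^{2\rho}|\lambda_n|^{-2}) \\
&= |\lambda_n|^{-1} n^{2\rho} + \frac{\zeta}{\mathbb{E}[W]} n^{2\rho}|\lambda_n|^{-2} + o(|\lambda_n|^{-2} n^{2\rho}),
\end{aligned} \tag{7.37}$$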
which proves (7.32) with $\beta = -\zeta/\mathbb{E}[W]$.
By Lemma 7.1(ii),



Var(E Wl W(i)) < ^(eK] 4 ^ E wf + K[w T ] 2 E[w 2 T ]j- E wf) = o^A" 4 ), (7.38) 



i€\n\ n i£\n\ n i£\n\ 



precisely when both terms in the middle inequality satisfy this bound. We complete the proof by checking 
these estimates. The first contribution is bounded by 

' nni^ 3 ^ 1 = °(« 4p IAn|- 4 ), (7.39) 



4(1 - MA„)) 4 4- 1 - I A. 



since $4\alpha + 3\eta - 1 = 2(\tau - 2)/(\tau - 1) = 2\rho < 4\rho$. The second contribution, instead, is bounded by

1 / E w lY ^ T3^™ 6Q+5r? ~ 2 = °(- 4p i A «r 4 )' ( 7 - 4 °) 



£2(1 - ^ n )5 V - | An 



since $6\alpha + 5\eta - 2 = (3\tau - 7)/(\tau - 1) < 4\rho = 4(\tau - 2)/(\tau - 1)$. This proves the required concentration for
$\sigma_2(x^{(n)})$ and hence completes the proof of Theorem 1.2 for cluster weights and for any fixed $\lambda$. □
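
The exponent comparisons used for (7.39) and (7.40) are pure arithmetic in $\tau$. The following is a minimal Python sketch that checks them with exact rationals. It assumes $\alpha = 1/(\tau-1)$, $\rho = (\tau-2)/(\tau-1)$ and $\eta = (\tau-3)/(\tau-1)$; these values are taken to be what (2.10) defines, which is an assumption here because (2.10) lies outside this section, although they agree with the relations $\alpha + \eta = \rho$ and $2\rho - \eta = 1$ used above. The function names are hypothetical.

```python
from fractions import Fraction


def exponents(tau):
    """Return (alpha, rho, eta) for a given tau, under the assumed definitions."""
    tau = Fraction(tau)
    alpha = 1 / (tau - 1)
    rho = (tau - 2) / (tau - 1)
    eta = (tau - 3) / (tau - 1)
    return alpha, rho, eta


def check_exponents(tau):
    """Check the exponent relations quoted in the text for one value of tau."""
    tau = Fraction(tau)
    alpha, rho, eta = exponents(tau)
    first = 4 * alpha + 3 * eta - 1    # exponent in the first contribution
    second = 6 * alpha + 5 * eta - 2   # exponent in the second contribution
    return {
        "alpha + eta = rho": alpha + eta == rho,
        "2 rho - eta = 1": 2 * rho - eta == 1,
        "4 alpha + 3 eta - 1 = 2 rho < 4 rho": first == 2 * rho and first < 4 * rho,
        "6 alpha + 5 eta - 2 = (3 tau - 7)/(tau - 1) < 4 rho":
            second == (3 * tau - 7) / (tau - 1) and second < 4 * rho,
    }


if __name__ == "__main__":
    # A few sample values of tau in the interval (3, 4).
    for tau in (Fraction(31, 10), Fraction(7, 2), Fraction(39, 10)):
        print(tau, check_exponents(tau))
```

Exact fractions are used instead of floating point so that the equalities are tested without rounding error.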



Convergence of the finite-dimensional distributions of the random graph multiplicative coalescent.

So far, we have proved the convergence of $X^{(n)}(|\lambda_n| + \lambda)$ for a fixed time $\lambda$. By [4, Lemma 26], there exists
an eternal multiplicative coalescent with the same marginal for every $\lambda$. By the strong Feller property
of multiplicative coalescents proved in [2], as well as [4, Lemma 27], the convergence of $X^{(n)}(|\lambda_n| + \lambda_1)$
implies that the future finite-dimensional distributions $(X^{(n)}(|\lambda_n| + \lambda_l))_{l=1}^{k}$ converge in distribution to
the finite-dimensional distributions of the eternal multiplicative coalescent. This completes the proof of
the convergence of the finite-dimensional distributions in Theorem 1.2 for cluster weights.



Convergence of cluster sizes from cluster weights. By the adaptation of Theorem 1.1 to cluster
weights in Theorem 1.4, we obtain that $\mathcal{W}_{\le}(j) = |\mathcal{C}_{\le}(j)|(1 + o_{\mathbb{P}}(1))$, so that the result immediately follows
for the cluster sizes.